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COMPUTATIONAL  EFFORTS  -  INTRODUCTION 

During  the  first  year  of  the  joint  ONR-RR  turbine  project,  computational 
efforts  have  concentrated  on  the  areas  of  minicomputer-array  processor  de¬ 
velopment  and  two-dimensional  computer  code  algorithm  development.  This 

annual  report  is  intended  to  be  a  comprehensive  review  of  the  work  performed 

/ 

/ 

in  these  areas  as  well  as  an  analysis  as  to  what  the  information  /gained 

\  ■ 

during  this  year  implies  for  the  future  direction  of  the  project.'-'  The 
particular  areas  covered  are;  1)  a  detailed  review  of  the  minicomputer-array 
processor  performance  and  prospects,  2)  a  summary  of  the  status  and  importance 
of  boundary  treatments  for  implicit  numerical  algorithms,  3)  a  summary  of 
the  status  of  grid  generation  efforts  for  turbine  type  geometries  and  4)  a 
review  of  work  done  to  analyze  the  application  of  MacCormack's  new  implicit 
time  marching  scheme  to  turbomachinery  geometries.  In  addition  a  computa¬ 
tional  development  plan  for  the  next  three  years  is  presented,  and  the 
present  status  of  test  examples  for  the  two-dimensional  analysis  code  is 
reviewed.  — 

The  minicomputer-array  processor  review  largely  appeared  in  an  earlier 
quarterly  report  and  is  repeated  here  for  completeness.  The  work  on  implicit 
boundary  treatments  was  presented  at  a  boundary  condition  symposium  conducted 
at  NASA  Ames  Research  Center  in  June  1981  and  an  exp’anded  version  is  expected 
to  be  published  in  the  Journal  of  Computational  Physics.  The  analysis  of 
the  new  MacCormack  scheme  was  presented  at  the  AIAA  20th  Aerospace  Sciences 
Meeting  in  January  1932  as  AIAA  Paper  82-0063  and  has  been  submitted  to  the 
AIAA  Journal  for  publication.  Technical  papers  reviewing  the  computer 
analysis  and  the  results  of  the  grid  generation  scheme  are  under  preparation. 
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MINICOMPUTER-ARRAY  PROCESSOR  PERFORMANCE 
Summary 

One  important  portion  of  the  original  ONR-RR  turbine  project  proposal 
was  the  development  of  a  minicomputer-array  processor  system  with  sufficient 
memory  and  processing  capacity  to  support  the  entire  computational  effort. 
Since  the  success  of  the  computational  project  largely  depended  on  the 
development  of  this  computer  system,  it  was  imperative  to  establish  early 
on  the  capacity  of  the  purchased  system.  This  report  section  reviews  the 
development  of  this  system  and  demonstrates  that  the  original  system  com¬ 
putational  speed  goals  have  been  met.  The  original  computer  concept  used 
host  computer  main  memory  to  provide  storage  for  the  three-dimensional 
problem  data  matrix,  but  the  widespread  introduction  of  the  64K  bit  memory 
chip  has  made  it  possible  to  specify  a  backup  mass  memory  system  which 
greatly  expands  the  storage  capacity  while  reducing  its  cost. 

Historical  Perspectives 

It  is  easy  to  understand  why  existing  general-purpose  computers  lack 
sufficient  computing  power  for  three-dimensional  computational  fluid  dynamics 
calculations.  General-purpose  computers  are  based  on  a  von  Neumann  type 
architecture,  in  which  (a)  a  single  processor  is  used,  (b)  a  single, 
separate  memory  is  used,  and  (c)  program  and  data  are  stored  in  the  same 
memory  so  that,  at  least  in  principle,  programs  can  write  other  programs. 
These  principles  were  formulated  at  a  time  when  both  arithmetic  and  logic 
units  (ALU)  and  memory  were  expensive  (hence  only  one  of  each  is  used,  and 
data  and  programs  can  share  a  single  address  space) ,  but  data  communication 
on  internal  bus  lines  was  cheap  (so  that  there  was  little  if  any  penalty  in 
separating  the  processor  and  memory) .  The  large  expense  of  high-performance 
machines  made  it  imperative  that  all  machines  be  general  purpose,  that  is. 
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be  able  to  perform  well  on  a  large  variety  of  algorithms.  However,  these 
von  Neumann  architectures  are  all  limited  by  two  bottlenecks:  the  single¬ 
sequence  bottleneck  and  the  memory  reference  bottleneck. 

The  single-sequence  bottleneck  arises  because  only  a  single  ALU  or 
processor  is  incorporated.  This  restriction  makes  programming  easier 
(indeed  it  has  influenced  the  very  nature  of  virtually  all  widely  used 
computer  languages) ,  but  it  forces  computations  that  could  in  principle 
be  done  simultaneously,  to  be  done  one  after  another.  Heroic  efforts 
have  been  made  to  eliminate  this  bottleneck,  and  techniques  in  common  use 
in  high  performance  computers  include  simultaneous  fetch  of  next  instruc¬ 
tion  from  memory,  and  pipelined  high-speed  arithmetic  operators.  These 
approaches  offer  higher  computation  speed  at  the  expense  of  greater  hard¬ 
ware  cost  and  complexity.  A  few  machines  have  attempted  to  circumvent 
this  bottleneck  by  incorporating  two  or  more  ALU's.  In  most  cases  such 
machines  have  given  disappointing  performance  when  more  them  four  ALU's 
have  been  used,  since  having  n  ALU's  does  not  reduce  computation  time  by 
a  factor  of  n.  This  is  because  techniques  for  unfolding  or  ("vectorizing") 
algorithms  are  only  partially  successful,  and  the  ultimate  computational 
speed  is  determined  by  those  instructions  that  are  not  done  in  parallel. 
Thus  even  a  relatively  innocuous  statement  that,  on  a  single-sequence 
machine,  would  not  take  much  time,  may  end  up  on  a  vector  machine  being 
done  on  one  ALU  while  all  the  others  are  idle.  Perhaps  the  most  successful 
vector  machine  is  the  CRAY-1,  and  in  that  case  much  of  the  success  is  due 
to  the  speed  of  the  individual  ALU's  rather  than  the  parallelism.  If  a 
vectorizing  compiler  is  used,  the  CRAY-1  can  routinely  achieve  30  MFLOPS 
(million  floating-point  operations  per  second) ,  which  is  considerably  below 


its  potential  of  about  150  MFLOPS.  In  order  to  get  maximum  performance 
from  the  CRAY-1,  a  programmer  must  be  aware  of  the  detailed  timing 
assumptions  in  the  architecture,  and  then  program  so  as  to  use  many 
functional  units  concurrently.  He  must  also  use  the  concept  of  chaining 
vector  operations,  and  deal  with  the  fact  that  the  preferred  vector  length 
is  64. 

The  memory  reference  bottleneck  of  von  Neumann  type  computers  arises 
because  a  single  memory  is  used.  Both  instructions  and  data  are  fetched 
from  or  written  to  this  memory.  Typically  the  bandwidth  of  the  memory 
communication  path  is  relatively  small,  either  because  of  slow  memory 
access  time  or  long  distance  between  the  memory  and  the  ALU.  In  addition, 
the  ALU  may  have  to  remain  idle  until  the  memory  fetch  is  complete.  A 
variety  of  successful  techniques  has  reduced  the  impact  of  this  bottleneck 
in  high-performance  computers.  First  of  all,  the  principle  of  space 
locality  is  used  to  predict  what  memory  locations  will  require  access 
next.  In  practical  problems  a  large  percentage  of  memory  accesses  occur 
very  close  to  the  previous  access.  Thus  if  large  segments  of  memory 
(pages)  are  brought  into  relatively  fast  memory,  in  a  large  majority  of 
cases  the  next  memory  location  required  will  already  be  in  "fast"  memory. 

The  techniques  of  cache  memory  and  paging  are  based  on  this  principle 
and  are  very  successful.  In  fact,  practical  computer  systems  have  a  memory 
hierarchy  with  at  least  four  levels — magnetic  tape  (quite  slow,  massive 
size,  used  for  day-to-day  storage);  disk  (faster  but  smaller,  used  for 
most  general  purpose  user-specified  storage);  cache  (much  faster,  quite 
small,  not  usually  under  user  control) ;  and  ALU  registers  (very  fast,  not 
many  of  them,  located  close  by,  used  for  storage  of  data  from  one  instruction 


to  the  next) 


5 


Minicomputer-Peripheral  Arithmetic  Processor  Concept 

The  design  of  a  fluid  dynamics  simulator  of  sufficiently  high  power  is 
quite  difficult  within  the  constraints  imposed  by  these  bottlenecks,  and  a 
common  thread  of  the  unconventional  architecture  concepts  which  have  been 
considered  is  the  attempt  to  avoid  the  von  Neumann  constraints  by  introducing 
parallelism  on  a  modest  or  massive  scale.  In  the  minicomputer-peripheral 
arithmetic  processor  concept,  a  general  purpose  minicomputer  is  used  to 
manage  a  memory  hierarchy  and  data  flow  for  a  special  purpose,  high  capacity 
ALU.  A  typical  minicomputer-peripheral  processor  layout  is  illustrated  in 
Figure  1.  Here  a  general  purpose  minicomputer  with  substantial  main  memory 
and  large  moving  head  disk  drives  supports  the  peripheral  arithmetic  pro¬ 
cessor.  Present  computer  hardware  cost  trends  indicate  that  the  host  mini¬ 
computer  should  be  a  32  bit  supermini  [Digital  Equipment  Corporation  VAX 
11/780  or  Perkin  Elmer  Corporation  3240  for  example]  with  one  to  two  million 
bytes  of  main  memory.  Smaller  and  less  expensive  computers  could  also  serve 
as  host  machines,  but  the  most  cost  effective  choice  appears  to  be  a  32  bit 
supermini . 

The  host  minicomputer  chosen  for  evaluation  was  a  32  bit  minicomputer 
produced  by  Perkin  Elmer,  the  PE  3242.  Important  architecture  features  of 
this  machine  are:  1)  a  high  data  rate  internal  memory  bus,  up  to  10  million 
32  bit  words/second  transfer  rate;  2}  four  DMA  (direct  memory  access)  ports, 
2.5  million  words/sec  transfer  rate  on  each  port;  and  3)  a  cache  memory 
system  well  suited  to  internal  data  shuffle  operations.  The  effective 
memory  access  time  is  500  ns,  the  cache  cycle  time  is  200  ns,  and  the  CPU 
cycle  time  is  260  ns.  Floating  point  hardware  multiply  times  are  approxi¬ 
mately  1  us  for  32  bit  single-precision  results.  A  block  diagram  of  the 
P.E.  3242  is  shown  in  figure  2. 


The  peripheral  arithmetic  processor  chosen  for  evaluation  was  the 
AP-120B  array  processor  produced  by  Floating  Point  Systems,  Inc.  of  Oregon. 
Floating  Point  Systems  is  the  dominant  manufacturer  of  such  peripheral 
processors,  and  the  AP-120B  internal  architecture  is  an  excellent  example 
of  the  potential  computational  advantages  of  abandoning  the  von  Neumann 
style  architecture.  The  AP-120B  consists  of  a  number  of  synchronous, 
parallel  logical  units,  each  operating  under  stored  program  control.  A 
block  diagram  of  the  machine  is  shown  in  Figure  3. 

The  parallel  structure  of  the  AP-120B  allows  the  overhead  of  array 
indexing,  loop  counting  and  data  fetching  from  memory  to  be  performed 
simultaneously  with  arithmetic  operations  on  the  data.  Stored  programs 
and  data  each  reside  in  separate,  independently  addressable  memories  to 
reduce  memory  accessing  conflicts.  Independent  floating  point  multiply  and 
adder  pipelines  both  allow  operations  to  be  initialed  every  machine  clock 
cycle  or  167  ns.  Address  indexing  and  counting  functions  are  performed  by 
an  independent  integer  arithmetic  unit.  For  certain  computations,  such  as 
a  Fast  Fourier  Transform,  the  computation  rate  is  near  that  of  a  floating 
point  multiply  and  add  result  every  clock  cycle  or  12  MFLOPS  (millions  of 
floating  point  operations  per  second) .  The  floating,  point  data  word  is 
38  bits  long. 

The  AP-120B  is  connected  to  a  PE3242  DMA  port  and  this  processor  con¬ 
tains  2048  (2K)  words  of  program  storage  memory  and  32K  words  of  333  ns 
cycle  time  main  data  memory.  The  AP-120B  cannot  be  considered  a  general 
purpose  scalar  or  vector  arithmetic  processor.  Achieving  good  performance 
requires  careful,  custom  hand  coding  of  critical  code  sections,  and  the 
limited  function  unit  parallelism  constrains  the  type  of  algorithm  that 
can  effectively  use  the  machine.  For  example,  a  memory  read  instruction 
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may  be  initiated  only  every  third  machine  cycle  so  that  true  vector  operations 
like  those  available  on  the  CDC  STAR-205  are  not  possible.  References  to 
the  floating  point  register  file  (data  pad  X  and  Y)  are  not  sufficiently 
flexible  to  overcome  this  difficulty.  To  optimize  the  operation  of  the 
AR-120B,  it  is  necessary  for  the  programmer  to  "look  ahead"  and  initiate 
memory  reads  prior  to  the  actual  time  values  from  data  memory  are  to  be 
used  in  a  calculation.  The  burden  on  the  programmer  to  be  cognizant  of 
machine  architecture  is  not  greatly  different  from  that  of  programmers  on 
the  CRAY  or  CDC  STAR  computer. 

Numerical  Solution  Scheme 

In  order  to  properly  focus  the  AP-12*0B  architecture  studies,  the 
system  was  evaluated  for  use  with  a  demanding  algorithm,  in  terms  of 
memory  size  and  complexity,  which  might  be  suitable  for  solving  the  three- 
dimensional,  unsteady  Reynolds  averaged  Navier-Stokes  equations.  The 
numerical  method  is  one  due  to  Beam  and  warming  [1]  of  NASA  Ames  Research 
Center  and  may  be  generally  described  as  an  approximate  factorization 
scheme  for  vector  sets  of  convection-diffusion  equations.  This  scheme 
may  be  represented  as: 


(I  +  L  ) (I  +  L  ) (I  +  L  )AUn  -  RHSn  (1) 

x  y  z 

In  this  representation,  I  is  the  identity  matrix,  L  ,  L  ,  L  are 

x  y  z 

linear,  finite  difference  operators  in  the  x-y-z  coordinate  directions, 

AU*  is  the  change  in  solution  state  vector  from  time  step  n  to  time  step 
n+1,  and  RHS°  is,  in  effect,  the  steady-state  solution  to  the  Navier-Stokes 


equations  at  time  level  n.  The  computational  solution  proceeds  as: 
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(I 

+  L  )AU* 
x 

= 

RHS" 

Step  1 

(2) 

(I 

+  L  )AU** 
y 

Au* 

Step  2 

(3) 

(I 

+  L  )AUn 
z 

- 

.  ★  * 

AU 

Step  3 

(4) 

un+1 

3 

Un  +  AU11 

Step  4 

(5) 

where  steps 

1 ,  2 ,  and 

3 

involve  calculation 

of  parameters,  evaluation 

of 

the  RHS,  and  solution 

of 

a  large  set  of  sparse,  simultaneous,  linear 

equations.  The  equation  set  to  be  solved  is  block  tridiagonal  and  is 
solved  by  LU  decomposition.  The  size  of  I  and  other  submatrices  appearing 
is  (mxm)  where  m  is  the  number  of  vector  equations  to  be  solved,  m»  5  for 
ordinary  problems.  The  number  of  simultaneous  equations  is  m  times  the 
number  of  finite  different  nodes  along  a  physical  direction.  Details  of 
the  finite  difference  algorithm  are  presented  in  reference  [2] . 

The  characteristic  operation  of  this  scheme  is  the  solution  of  a  large 
set  of  simultaneous  equations,  shown  below  in  their  banded  matrix  form. 


Ao 

Bo 

co 

0 

0 

•  •  • 

0 

Auo 

Do 

A1 

B1 

C1 

0 

0 

•  •  • 

0 

AU1 

• 

D1 

0 

A2 

B2 

C2 

0 

•  •  • 

0 

AU2 

. 

• 

°2 

e 

• 

0 

• 

• 

0 

# 

• 

0 

^J-2 

°J-2 

• 

• 

e 

e 

• 

e 

AJ-1 

BJ-1 

CJ-1 

AUJ-1 

DJ-1 

0 

0 

•  e  # 

AJ 

BJ 

CJ 

AUJ 

Here  the  block  matrix  elements,  ky  B ^ ,  are  themselves  mxm  whose 
elements  depend  only  on  the  solution  at  time  level  n.  D.  is  a  vector  of 
length  m  that  depends  either  on  the  solution  at  time  level 'n  or  is  known 
from  the  previous  approximation  step. 


9 


When  using  a  von  Neumann  style  computer  having  a  single  ALU  and  a 
limited  amount  of  local  register  storage,  the  sequence  of  operations  to 
find  a  solution  is  immaterial,  and  it  may  be  found  by: 

1)  Computing  the  RHS  function  of  equation  (2)  at  all  node  points 
and  storing  in  main  memory. 

2)  Solving  the  equation  system  (2) ,  for  each  x  coordinate  line. 
This  operation  requires  that  matrix  coefficients  Aj ,  B  and  Cj  be  calculated 
and  a  matrix  equation  like  equation  (6)  be  solved.  The  equation  system  is 
solved  in  two  sweeps  by  LU  decomposition.  The  forward  sweep  appears  as: 


ej  -  =;1  <-v 


-1 


fj  *  Gj  lD)  * 
S3  ’  CjEj-l  +  Ai 

and  the  backward  sweep  would  appear  as: 


Au.  -  E.U.  ,  +  F. 
3  33+1  3 


(7) 

(8) 
(9) 

(10) 


-1 


where  G  indicates  the  inverse  of  G  j .  Of  course,  for  numerical  calcu¬ 
lations  on  a  conventional  computer,  G.  ^  should  never  be  found;  rather, 
the  elements  of  Ej  are  solved  for  directly.  With  certain  types  of  computer 
hardware ,  it  may  be  advantageous  to  compute  Gj  ^  . 

3)  Solving  the  matrix  equations  (3)  for  each  y  coordinate  grid 
line  as  in  step  2. 

4)  Solving  the  matrix  equations  (4)  for  each  z  coordinate  grid 
line  as  in  step  3  and  updating  the  solution  from  time  level  n  to  time  level 


n  +  1. 


A  reference  problem  with  a  finite  difference  grid  of  50  x  50  x  100  will 


be  considered.  At  each  grid  node  about  20  floating  point  words  must  be 


stored  for  a  problem  data  storage  requirement  of  5.0  million  words.  For 


purposes  of  total  timing  estimates,  it  was  assumed  that  500  iterations  or 
time  steps  would  be  required  to  compute  a  steady  state  solution. 

It  is  felt  that  the  solution  time  per  iteration  for  the  Beam  and 
Warming  algorithm  is  typical  of  many  other  algorithms  and  that  the  block 
tridiagonal  matrix  structure  is  typical  of  many  implicit  solution  schemes. 
Thus  solution  times  for  this  algorithm  should  be  a  good  guide  for  hardware 
performance  on  other  algorithms. 

Two  Dimension  Test  Computation  -  Host  Computer  Only 

The  Beam  and  Warming  algorithm  described  in  the  previous  section  was 
first  tested  on  the  host  minicomputer  in  a  FORTRAN  only  version.  For  the 
two-dimensional  test  solution,  the  finite  difference  operator  sequence 
appears  as: 


(I  +  Lx)AU* 

=  HHSn 

Step  1 

(11) 

(I  +  Ly)  Au° 

=  AU* 

Step  2 

(12) 

un+1 

=  un  +  Aun 

(13) 

The  performance  of 

the  present  FORTRAN  code 

is  illustrated  in 

figure 

for  a  finite  difference  grid  line  of  length  50  nodes.  Such  grid  line  re¬ 
quires  1.65  ms/node  to  compute  the  RHS  function,  0.632  ms/node  to  compute 
the  matrix  coefficient  and  2.17  ms/node  to  solve  the  matrix  equation.  A 
test  problem  of  size  50  x  100  nodes  would  require  36  seconds  per  time  step 
and  5  hours  for  500  time  steps. 

Array  Processor  Test  Calculation 
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is  illustrated  in  figure  5.  The  RHS  function  still  requires  1.65  ms/node 
to  compute,  since  it  was  left  in  the  host  computer,  but  the  matrix  solution 


now  requires  only  0.34  ms/node  rather  than  2.80  ms.  This  result  is  a 
speed-up  factor  of  approximately  8.2.  An  overhead  factor  of  approximately 
6  ms/line  is  incurred.  With  the  RHS  function  calculated  in  the  host  computer, 
a  time  step  now  requires  20  seconds,  and  500  time  steps  requires  2.7  hours. 

A  total  speed-up  factor  of  3.2  was  achieved  with  just  the  coefficient  and 
matrix  equation  solution  done  in  the  array  processor.  When  the  RHS  cal¬ 
culation  is  moved  into  the  AP-120B,  it  is  estimated  that  a  time  step  will 
require  5.46  seconds,  and  500  time  steps  will  require  45  minutes. 

AP-120B  Architecture  Considerations 


The  first  test  calculations  demonstrated  both  the  advantages  and  the 
restrictions  of  the  AP-120B  architecture.  These  limitations  are  best 
explained  by  considering  a  vector  operation  which  occurs  often  in  CFD 


simulations : 

a.  n+1  n  /v  n  ^  n 

U.  -  Oj  +  a  (U "+1  -  Uj_1)  ;  j  «  1  to  J  (14) 

A  vector  of  length  J  is  updated  using  a  central  difference  operator  multi¬ 


plied  by  a  variable  coefficient.  Figure  6  shows  how  this  operation  could 
be  carried  out  using  multiple,  chained  pipelines  and  multiple  register 


files.  With  such  a  configuration  one  final  result,  U_.  requiring  2  add 
operations  and  1  multiply  operation,  can  be  obtained  each  machine  clock 
cycle.  The  resulting  computation  rate  is  18  MFLOPS  (millions  of  floating 
point  operations  per  second) . 

The  AP-120B  has  only  a  single  add  unit  and  a  single  multiply  unit  and 
only  a  limited  amount  of  register  file  storage,  64  words,  so  that  chained 
pipeline  operation  is  not  possible.  The  best  computation  rate  that  can  be 
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maintained  on  the  example,  equation  (10) ,  is  3  multiply  operations  and  6 
add  operations  every  6  clock  cycles.  This  computation  rate  is  9  MFLOPS. 

The  major  architecture  constraint  of  the  AP-120B  is  the  main  data 
memory  access  time.  Main  data  memory,  or  table  RAM,  cannot  be  simultaneously 
read  from  and  written  to,  and  only  a  single  word  can  be  read  or  written  per 
clock  cycle.  The  example  algorithm  requires  3  memory  references  per  final 
result.  For  standard  speed  memory  (333  ns  cycle  time)  a  memory  reference 
can  be  initiated  only  every  third  clock  cycle  so  that  the  example  calculation 
is  memory  reference  limited.  Each  final  result  requires  9  clock  cycles  for 
a  computation  rate  of  2  MFLOPS. 

In  principle,  the  use  of  interleaved,  fast  memory  (167  ns  cycle  time) 


would  allow  a  9  MFLOP  computation  rate  to  be  maintained,  but  each  memory 

reference  address  must  be  generated  through  S-Pad  operations  rather  than 

through  hardware.  In  practice,  it  appears  that  maximum  computation  rate 

that  can  be  maintained  is  about  4  MFLOPS  for  general  algorithms.  Such  a 

sustained  computation  rate  is  comparable  to  best  CDC7600  speeds  and  is 

impressive  for  a  machine  whose  hardware  cost  is  around  $60,000. 

An  interesting  result  of  the  memory  reference  limitation  is  that  the 

sequence  of  operation  in  the  block  LU  decomposition,'  equations  (7)  through 

(10),  can  be  critical.  In  this  sequence,  the  number  of  temporary  results, 

2 

A,  for  example,  scales  as  m  ,  where  m  is  the  number  of  conservation  equations. 
If  these  coefficients,  once  calculated,  were  written  to  main  data  memory, 
the  memory  reference  limitations  would  be  much  mors  important.  The  inter¬ 
mediate  results,  E.  and  F.,  are  needed  later  in  the  backward  sweep, 

3  3 

equation  (10),  and  should  be  written  to  table  RAM.  In  more  general  terms, 
effective  use  of  the  AP-120B  requires  the  programmer  to  identify  and  exploit 


showed  that  a  large  amount  of  AP-120B  main  data  memory  is  not  required  to 
sustain  calculations  at  the  maximum  plausible  rates.  Referring  to  figure  5, 
the  overhead  time  needed  to  transfer  data  to  and  from  the  AP-120B  is  small 


and  can  be  made  negligible.  The  overhead  results  mainly  from  a  poor  Perkin- 
Elmer  Floating  Point  Systems  interface  (hardware  and  software) ,  and  it 
appears  that  this  overhead  time  can  be  reduced  to  less  than  5%  of  computation 
time  for  problems  of  modest  size. 

If  the  AP-120B  can  work  effectively  on  small  amounts  of  data,  problem 
data  matrix  storage  can  be  maintained  on  a  memory  hierarchy  rather  than  in 
the  AP-120B  main  data  memory.  The  hierarchy  considered  is  to  maintain  the 
full  data  matrix  (data  matrix  is  of  4  to  40  million  floating  point  words) 
on  a  mass  storage  device  at  all  times;  to  maintain  small  sections  of  the 
data  matrix,  of  order  150,000  floating  point  words,  in  the  host  computer 
memory  for  algorithm  processing,  and  to  transfer  only  a  few  thousand  words 
to  the  AP-120B  data  memory  for  detailed  computational  tasks.  For  low  cost 
systems,  the  mass  storage  device  would  be  a  moving  head  disk,  while,  for 
higher  performance  systems,  the  storage  device  would  be  a  semi-conductor 
mass  memory  system. 

For  either  mass  storage  device,  segmentation  of  the  problem  matrix  on 
the  memory  system  is  an  important  consideration;  the  segmentation  considered 
is  shown  in  figure  7.  It  is  not  known  yet  if  this  is  the  optimum  segmentation. 
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but  it  is  felt  to  be  near  optimum  for  small  main  memory  size.  The  data 
base  is  arranged  such  that  the  smallest  memory  system  addressable  element 
is  a  (5x5x5)  node  block  containing  5x5x5x20  floating  point  words. 

The  smallest  working  set  to  be  in  host  computer  memory  is  three  axial  data 
planes  composed  of  300  (5  x  5  x  5)  sub-blocks. 

When  an  axial  data  plane  set  is  in  memory,  the  first  two  operators 
may  be  executed  on  one  axial  plane: 

(I  +  L^AU*  =  RHSn  (15) 

(I  +  Ly)AU**  =  U  (16) 

This  axial  plane  must  then  be  returned  to  the  memory  system  and  a  new  plane 
read  in.  When  100  axial  planes  have  been  processed,  an  axial  grouping  of 
20  •  (5x5x5)  blocks  is  read  in  and  the  last  operator  executed: 


(I  +  L  )AUn  =  U** 

Z 

(17) 

u"+1  .  u°  +  iua 

(18) 

One  full  iteration  is  completed  when  100  sets  of  this  last  axial  grouping 
have  been  assembled  and  processed. 

This  data  structure  implies  a  large  quantity  of  data  movement  per 
solution  time  step  since  each  axial  plane  must  be  moved  from  the  memory 
system  to  the  host  computer  memory  to  the  AP-120B  back  to  the  host  computer 
memory  and  back  to  the  memory  system.  This  process  must  also  be  repeated 
for  each  time  step,  which  means  that  for  the  reference  grid  32  MWords  must 
be  moved  for  each  solution  time  step.  The  host  computer  selected  for 
evaluation,  the  Perkin  Elmer  3242,  is  probably  unique  in  a  commercially 
available  minicomputer  in  that  its  internal  data  base  structure  is  capable 
of  handling  this  data  quantity  at  high  speed.  Under  worst  case  conditions. 
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the  data  bus  structure  can  support  an  aggregate  data  rate  of  10  MWords 
per  second.  Thus  the  entire  data  shuffle  operation  need  require  only  4 
seconds  real  time  if  the  AP-120B  and  the  memory  system  can  supply  and 
accept  data  at  an  adequate  rate. 

The  AP-120B  can  perform  block  data  transfers  at  the  rate  of  1  word 
each  clock  cycle  or  6  MWords  per  second.  However,  it  is  expected  that 
only  a  small  amount  of  data  will  be  transferred  to  or  from  the  AP-120B 
at  any  one  time,  and  the  I/O  time  will  be  dominated  by  the  overhead  time 
to  initiate  a  transfer  operation.  This  overhead  time  is  estimated  to  be 
about  2  ms  per  transfer  for  a  total  overhead  time  of  25  seconds  per  time 
step  on  the  50  x  50  x  100  reference  grid. 

As  can  be  seen  from  these  simple  considerations,  the  overhead  time 
in  setting  up  the  AP-120B  data  transfers  is  5  times  larger  than  the  host 
data  transfer  time.  However,  the  overhead  time  does  imply  a  minimum  vector 
length,  which  does  depend  on  the  numerical  algorithm  complexity,  below 
which  no  gain  is  made  by  using  the  AP.  For  the  test  algorithm,  equation 
(14)  and  a  2  ms  overhead,  a  minimum  vector  length  of  about  300  is  required 
if  the  host  computer  can  perform  floating  point  operations  in  an  average 
time  of  2  us.  A  2  us  average  computation  time  is  quite  generous  for  the 
host  computer  considered. 

This  type  of  simple  operation  count,  however,  seriously  overestimates 
the  importance  of  AP-120B  overhead  since  computational  fluid  dynamics 
problems  are  characterized  by  the  fact  that  coefficient  calculations,  cu 
in  equation  (14)  ,  are  complex  algebraic  functions  of  the  current  solution 
vector.  For  example,  it  appears  that  the  minimum  time  to  perform  the  block 
tridiagonal  inversion  is  about  0.2  ms  per  node  for  the  two-dimensional  case. 
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For  the  overhead  to  be  10%  of  the  calculation  time  for  just  the  block  in¬ 
version  requires  a  vector  length  of  100.  If  the  calculation  time  for  the 

coefficients  A.,  B.,  C.  is  also  considered,  the  vector  length  for  10%  over- 
3  3  3 

head  time  falls  to  50,  and  it  falls  to  25  if  the  RHS  calculation  is  con¬ 
sidered.  For  three-dimensional  calculations,  the  minimum  vector  length  for 
10%  overhead  is  as  low  as  5  or  10.  It  is  also  to  be  remembered  that  the 
10%  overhead  time  is  about  1%  of  the  time  required  to  do  the  calculation 
on  the  host  computer. 

The  introduction  of  64K  RAM  VLSI  memory  chips  has  made  the  concept  of 
large,  fast,  inexpensive  backup  memory  systems  very  attractive.  Several 
vendors  now  supply  memory  systems  made  up  from  these  chips  complete  with 
power  supplies,  error  correction  and  custom  interfacing  designed  as  moving 
or  fixed  head  disk  replacements.  A  particularly  interesting  system  is  the 
memory  system  3000  sold  by  Motorola  Inc.  which  contains  a  maximum  of  32M 
Bytes  in  a  single  chassis.  A  block  diagram  of  this  system  is  shown  in 
figure  8. 

The  64K  RAM  chips  are  arranged  on  16  memory  cards  which  are  individually 
connected  to  an  internal  memory  bus.  The  memory  address  controlled  (ACC) 
has  parallel  access  to  all  16  memory  cards,  thus  making  available  on  the 
user  bus  16  72-bit  words  each  500  ns  cycle.  In  block  transfer  mode,  this 
structure  results  in  a  64M  Bytes  per  second  transfer  rate.  In  random  access 
mode,  the  data  transfer  rate  is  16M  Bytes  per  second.  The  maximum  transfer 
of  the  host  computer  interface  channel  is  10M  Bytes  per  second  so  that  the 
Motorola  memory  system  can  supply  data  in  the  random  access  mode  faster 
than  the  host  computer  can  accept  it. 

An  overhead  or  setup  time  will  be  required  to  initiate  a  transfer  data 
between  the  memory  system  and  the  host  computer,  but  the  total  overhead 


time  should  be  small  compared  to  the  AP-120B  overhead  time.  Memory  system 


read/write  requests  will  be  initiated  for  fewer  times  than  I/O  setups  for 
the  AP  since  the  memory  system  will  be  used  to  store  the  data  blocks  illus¬ 
trated  in  figure  6.  These  blocks  are  broken  down  to  individual  grid  lines 
to  transfer  to  the  AP. 

Hardware  Cost  Estimate 

The  hardware  costs  for  the  host  computer  with  2M  Bytes  main  memory,  a 
300M  Byte  moving  head  disk  and  DMAI  interfaces  is  $200,000.  The  hardware 
cost  for  AP-120B  with  32K  words  main  data  memory  (333  ns  cycle  time)  and 
P.E.  3242  interface  is  $65,000.  The  Motorola  memory  system  will  cost 
approximately  $90,000  for  16M  Bytes  storage  and  perhaps  $20,000  for  the 
custom  interface  required.  The  total  hardware  cost  for  a  minimal  system 
is  $375,000. 

Price — Performance  Estimate  for  Reference  Grid 

The  two-dimensional  test  codes  allow  a  rough  estimate  of  the  system 
performance  on  the  reference  grid  problem.  The  AP-120B  calculation  time 
without  consideration  for  overhead  time  is  estimated  to  be  0.4  ms  per  mode, 
based  on  the  two-dimensional  simulations,  which  yields  a  calculation  time 
of  41  hours.  Assuming  a  10%  overhead  time  gives  a  total  computation  time 
of  45  hours.  When  the  two-dimensional  test  codes  were  constructed,  the 
memory  reference  limitations  of  the  AP-120B  were  not  properly  understood, 
and  it  is  estimated  that  the  calculation  time  can  be  reduced  by  a  factor 
of  two.  This  saving  results  from  not  storing  the  coefficients,  A^ ,  B ^ 
in  equations  (6)  through  (10)  in  main  data  memory.  The  estimated  computation 
time  on  the  reference  grid  is  then  23  hours.  The  average  computation  rate 
is  approximately  3  MFLOPS  with  a  total  hardware  cost  of  $265,000. 
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A  minimum  cost  system  would  replace  the  semiconductor  memory  system 
with  the  moving  head  disk  which  increases  the  total  computation  time  by 
introducing  the  average  disk  seek  time  and  slower  data  transfer  rate  into 
the  calculation.  This  overhead  time  is  roughly  estimated  to  24  hours  and 
is  not  easily  overlapped  with  AP-120B  calculation  time.  A  total  computation 
of  47  hours  is  estimated  when  storing  the  solution  data  matrix  on  disk. 

A  maximum  performance  system  would  use  the  P.E.  3242  base  structure 
to  drive  4  AP-120B  units.  Since  all  calculations  along  grid  lines  can  be 
made  independent,  the  total  computation  time  can  be  reduced  by  a  factor  of 
four.  The  total  computation  time  would  then  be  about  6  hours.  The  average 
computation  rate  would  be  about  12  MFLOPS  with  a  total  hardware  cost  of 
about  $570,000.  The  spectrum  of  price-performance  is  illustrated  in 
figure  9. 

Analysis  and  Discussion 

One  major  factor  leading  to  exceptional  performance  of  the  minicomputer- 
array  processor  combination  is  a  good  match  between  operating  speeds  of 
different  system  devices  when  the  mass  memory  system  uses  the  64K  bit 
MOS  memory  chips.  The  critical  weakness  of  present  supercomputers  is  in  main 
data  memory  costs  and  communication  strategies.  In  these  machines  main 
memory  is  made  up  of  fast  but  expensive  ECL  (emitted  coupled  logic)  chips 
which  still  require  complex  communication  strategies  to  generate  enough 
memory  access  bandwidth  to  keep  pace  with  arithmetic  processing.  At  the 
reduced  pipeline  speeds  of  the  AP-120B,  low  cost  MOS  memory  chips  with 
simple  communication  strategies  easily  provide  the  needed  memory  access 
bandwidth.  The  balanced  minicomputer  system  cost  is  reduced  to  the  point 
that  dedicated  systems  can  provide  larger  memory  sizes  than  available 


supercomputers 
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BOUNDARY  TREATMENTS  FOR  IMPLICIT  SOLUTIONS  TO  EULER  AND  NAVIER-STOKES  EQUATIONS 

Summary 

The  importance  of  boundary  treatments  for  implicit  algorithms  was 
greatly  underappreciated  until  recent  work  by  Yee,  Beam  and  Warming  [4] 
appeared.  This  work  used  a  modal  stability  analysis  originated  by  Kriess 
[5]  to  analyze  the  effect  of  several  different  boundary  treatments  on 
algorithm  stability.  This  theory  strictly  applied  only  to  linear  equations 
with  constant  coefficient  in  one  space  dimension,  and  a  computational  study 
was  conducted  to  test  its  relevance  to  realistic  Euler  or  Navier-Stokes 
computations.  It  was  found  that  for  both  explicit  and  implicit  boundary 
treatments,  it  was  possible  to  compute  solutions  with  time  steps  50  to 
100  times  explicit  time  limits  while  retaining  the  ability  to  choose  rather 
arbitrary  initial  conditions.  An  even  more  important  computation  result 
which  was  observed  is  that  while  large  time  step  sizes  may  be  used,  the 
largest  convergence  rates  occur  at  relatively  small  time  step  sizes.  For 
the  two-dimensional  test  cases  considered,  the  best  time  steps  were  of 
order  10  times  the  explicit  limits.  Since  the  present  implicit  codes 
require  more  than  10  times  the  operations  per  time  step,  the  convergence 
rates  of  the  implicit  codes  must  be  improved  before  they  earn  be  considered 
superior  to  the  explicit  algorithms. 


BOUNDARY  TREATMENTS  FOR  IMPLICIT  SOLUTIONS 
TO  EULER  AND  NAVIER-STOKES  EQUATIONS* 

W.  T.  Thompkins,  Jr. 

R.  H.  Bush 

Massachusetts  Institute  of  Technology 

INTRODUCTION 

Implicit  time  marching  schemes  like  those  of  Beam  and  Warming  [1] , 
Briley  and  McDonald  [2]  ,  and  MacCormack  (1980)  [3_]  generally  have  not  been 

as  robust  as  would  be  expected  from  a  stability  analysis  for  the  pure  initial 
value  problem.  Recently,  Yee,  Beam  and  Warning  [4_]  illustrated  that  a  more 
general  stability  analysis,  which  includes  the  effect  of  boundary  conditions, 
may  explain  some  of  the  seemingly  anomalous  behavior  of  these  schemes.  The 
major  theoretical  basis  for  this  type  of  modal  stability  analysis  was 
established  in  a  series  of  papers  by  Kriess  [5_,6J  ,  Osher  [7_,8_]  and 
Gustafsson  et  al  [9_]  . 

Yee  as  well  as  Gustafsson  and  Oliger  [10]  considered  the  effect  of 
inflow/outflow  boundary  condition  formulations  on  the  stability  of  a  class 
of  numerical  schemes  to  solve  the  Euler  equations  in  one-space  dimension. 

The  characteristic  feature  of  a  subsonic  inflow/outflow  boundary  is  that 
a  priori  boundary  values  may  be  specified  for  only  some  problem  variables 
while  remaining  boundary  values  must  be  determined  as  part  of  the  solution 
process.  Yee  demonstrated  a  rather  large  disparity  in  stability  bounds 
between  the  use  of  explicit  or  implicit  extrapolation  procedures  and  in 
general  demonstrated  that  implicit  extrapolation  procedures  had  the  least 
restrictive  stability  bounds.  The  intent  of  this  work  is  to  explore  com¬ 
putationally  the  implication  of  this  work  for  several  two-dimensional  Euler 
and  Navier-Stokes  simulations. 
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The  strong  conservation  law  form  may  be  retained  under  a  general 
coordinate  mapping  as  illustrated  in  Viviand  [11] .  All  computations  to 
be  described  were  conducted  in  a  mapped  computational  domain  but  numerical 
and  boundary  condition  procedures  will  be  described  in  the  simple  two- 
dimensional  geometry  shown  in  figure  1  for  simplicity. 

A  1979  paper  by  Beam  and  Warming  [12]  outlined  a  solution  scheme 
for  systems  of  equations  of  the  form  (1)  which  includes  most  numerical 
schemes  for  which  the  modal  boundary  condition  analysis  has  been  conducted. 
This  scheme  uses  the  well  developed  methods  for  ordinary  differential 
equations  as  a  guide  to  developing  numerical  methods  for  partial  differential 
equations.  The  scheme  presented  combines  Linear  Multistep  Methods,  local 
linearization,  approximate  factorization  and  One  Leg  methods.  The  scheme, 
a  generalization  of  the  scheme  presented  in  reference  [1_] ,  solves  for  a 
variable  p(E)u  which  is  equivalent  to  Aun  in  the  class  of  schemes  represented 
by  the  earlier  paper.  The  earlier  scheme  is  somewhat  easier  to  understand 
as  Au11  is  just  the  change  in  the  solution  from  time  level  n  to  level  n  +  1, 
while  p(E)u  is  a  more  general  time  differencing  formula. 

The  solution  schemes  chosen  are  implemented  as: 


(I  +  L  n  ) AU* 

=  RHSn 

(2) 

y 

(I  +  L  n  )Aun 

*  AU* 

(3) 

X 

„n+l  n 

.  n 

U  =  U 

+  Au 

(4) 

where 

RHSn  is  very  nearly  the  finite  difference  approximation  to  the  steady 
state  equations,  and 

L  and  L  are  linearized  finite  difference  operators  representing  a 
x  y 

particular  time  and  spatial  differencing  scheme. 


Full  details  of  these  operators  are  contained  in  Beam  and  Warming 


[1_] .  If  the  spatial  differencing  is  taken  to  be  centered,  the  computational 
form  of  either  equation  (2)  or  (3)  appears  at  each  interior  point  as: 


A  ,n  Au  n  +  BnAU.n  +  cnAun.  =  D.n 
r  l-l  li  l  i+l  l 


(5! 


where  A.,  B.,  and  C.  are  4x4  matrices,  known  at  time  level  n, 

ill 


D.  is 

l 


the  right-hand  side  vector  at  node  point  i  known  at  time  level n,  and  Au. 


is  the  unknown  vector  at  node  point  i.  The  boundary  points  will  be  assumed 
to  involve  only  the  nearest  two  points  in  the  x  direction. 


AC>0  +  *0^1  +  C0tU2  =  D0n 


(6: 


The  restriction  to  extrapolation  along  grid  lines,  actually  transformed 

grid  lines,  is  necessary  to  maintain  the  block  tridiagonal  form 

and  avoids  possible  instabilities  due  to  skewed  extrapolation, 

see  Abarbanel  and  Murman  [13_]  .  Extrapolation  procedures 

using  more  than  the  two  nearest  neighbors  can  also  be  included  in  the 

process  to  be  described. 


The 
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(7) 


th 

and  will  reduce  to  tridiagonal  form  if  the  first  and  n  equations  are  sub- 

th 

stituted  into  the  second  and  the  n-1  equations. 
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B*  o'  0  ...  0 

Au 

D,' 

1  1 

1 

1 

A-  B„  C.  ...  0 

Au 

D_ 

2  2  2 

2 

2 

0 

• 

• 

• 

0 

= 

• 

I  A  _  B  -  C  _ 

AU  _ 

D  _ 

n-2  n-2  n-2 

n-2 

n-2 

0  ...  0  A  '  ,  B  ' 
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n-1  n-1 
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n-1 

where  for  example 

Bl  ■  Bi  -  AlAo 1 B0  • 


(8) 


This  cumbersome  development  allows  us  to  show  clearly  how  a  large  variety 
of  explicit  or  implicit  boundary  forms  can  be  included  without  difficulty. 


BOUNDARY  TREATMENTS 
Inflow/Outflow  Boundary 

The  finite  difference  algorithms  studied  usually  require  more 
boundary  values  than  are  required  for  the  partial  differential  equations 
which  they  simulate.  These  extra  numerical  boundary  conditions  cannot  be 
set  arbitrarily  and  are  usually  determined  through  an  extrapolation  pro¬ 
cedure.  These  extrapolation  procedures  may  either  be  explicit,  that  is 
boundary  values  needed  at  a  new  time  are  determined  uniquely  from  the  old 
time  level  solution,  or  implicit,  that  is  boundary  values  are  determined 
as  part  of  the  new  time  level  solution.  The  analytical  boundary  conditions 
or  the  extrapolation  quantities  are  usually  not  conservation  variables  but 
primitive  variables  and  a  local  linearization  is  usually  required  as  part 
of  defining  the  extrapolation  procedure. 
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Consider,  for  example,  an  implicit,  subsonic  outflow  boundary  at 
which  the  local  static  pressure  is  specified  as  a  boundary  condition,  and 
all  other  variables  are  to  be  determined  by  extrapolation.  Figure  1  shows 
a  typical  computational  grid  and  defines  the  subscripts  used. 


„n+l  _ n 

P .  .  =  P .  .  ;  given 

1/3 


(9) 
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PU 
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,0V. 
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pvj 

i/ j-1 
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n+1 


implicit 

space 

extrapolation 


(10) 


i/ j-2 

In  order  to  complete  the  boundary  formulation,  all  equations  must  be  ex¬ 
pressed  in  delta  form  and  in  terms  of  conservation  variables.  For  the 
total  internal, energy  this  may  be  done  through  its  definition: 


Et  = 


_P_  +  i 

(Y-l)  2 


(Pu)2  +  (Pv)2^ 


(11) 


P  P 

Since  the  relations  between  conservation  variables  are  nonlinear,  some 
linearization  step  will  be  necessary  before  the  boundary  condition  formulation 
may  be  used.  We  choose  to  introduce  our  linearization  step  here  as: 

n 


AEt  =  fEtn+1  -  E”  )  =  AP  -  |  (u2  +  v2)  Ao  +  unA(pu) 


+  vnA(pv)  +  (AuAv,  Au2,  Av2,  ApAu,  ApAv) 


(12) 


If  terms  of  order  AuAv  are  neglected,  the  error  is  equivalent  to  the 
linearization  error  of  the  interior  point  scheme.  We  may  express  the 
transformation  from  boundary  variables  to  conservation  variables  as: 
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(13) 


We  will  in  general  denote  transformation  from  conservative  to  primitive 


variables  as 


AW.  .  =  T.  .AU.  . 

i,l  i,l 


The  extrapolation  conditions  for  AW^  ^  ares 


W.  _ 
i,J 


’  Ap' 

= 

Apu 

Apv 

.  AP. 

i,J 

0  2  0  0  Apu 
0  0  2  0  Apv 


+  f-1  0  0  01 [Ap 

0-100  Apu 
0  0-10  Apv 


0  0  0  0j[  APJ.^ 


o  o  o  oHap 


w.  ,  =  P,  ,  W.  ,  ,  +  p,  „  w.  ,  „ 
i, J  J— 1  i,  J-1  J— 2  i,  J-2 

The  final  equations  relating  the  boundary  conservation  variables  and  the 
interior  conservation  variables  are: 


AU.  T  =  Nn  ,  PT  .Tn  _  ,AU.  ,  +  P,  0T.n  ,  ,AU.  „ 

IftJ  XfU~X  lfu“X  IfU  — ^ 


AU.  _  =  G.n  _  , AU .  _  .  +  Hn  ,  AU.  _  , 
i , J  i, J-1  i/J— 1  i, J-2  i, J-2 

With  the  definition  of  P  .  and  P_  _  given  in  equation  (15) ,  T.  and 

J— X  J— 2  X,*J— X 

T.  ,  .  are  identity  matrices, 
l ,  J-2 

An  explicit  outflow  boundary  treatment  was  constructed  using: 

_n+l  _n 
P  =  P  ;  given 


tpV-»i,J  ^i,J-l 
and  setting  G,  =H,  __=0, 
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In  forming  equation  (10) ,  we  choose  to  extrapolate  the  local  momentum 

flux  rather  than  a  specific  primitive  or  characteristic  variable;  choice  of 

other  extrapolation  variables  would  alter  only  the  transformation  matrix, 

T.  ..  Extrapolation  of  the  momentum  flux  is  somewhat  arbitrary,  but  its 

1 , 3 

choice  did  not  affect  the  accuracy  of  the  computational  results  to  be 
presented. 


Solid  Wall  Boundary  Procedures 

The  boundary  treatment  procedures  illustrated  for  inflow/outflow 
boundary  are  easily  extended  to  cover  solid  walls  in  either  inviscid  or 
viscous  flow  situations.  Here, 
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(20) 


or 


N  . AW_  . 
OfJ  0,3 


(21) 


where  q  is  the  velocity  parallel  to  the  wall  and  S  is  the  wall  slope.  For 

the  inviscid  flow  examples  3P/3y,  3T/3y  and  3u/3y  are  set  equal  to  zero, 

while,  for  the  viscous  flow  examples  v,  u,  and  3T/3y .are- set  equal  to  zero 

2 

and  3P/3y  is  equal  to  4/3  y(3/3y  ) (v) .  All  derivatives  are  evaluated  by 
one-sided  finite  difference  formulas. 

As  indicated  by  Buggeln,  Briley  and  McDonald  [14] ,  an  ADI  type  pro¬ 
cedure  requires  boundary  conditions  for  the  intermediate  step.  Usually  the 
intermediate  step  was  in  the  "y"  direction  and  the  boundary  conditions  were 
applied  as  if  the  intermediate  results  were  physical  quantities,  that  is, 
the  boundary  conditions  of  equation  (20)  were  applied  to  the  quantities 
AU*  of  equation  (2) . 


Explicit  wall  boundary  treatments  are  generated  by  applying  the 
primitive  variable  form  of  equation  (20)  and  forcing  the  correction  matrices 
to  be  zero. 

NUMERICAL  RESULTS 


Three  geometries  were  selected  for  detailed  study:  an  inviscid  super¬ 
sonic  diffuser  with  weak  oblique  shock,  supersonic  in  -  supersonic  out;  an 
inviscid  supersonic  diffuser  with  a  strong  normal  shock,  supersonic  in  - 
subsonic  out,  and  a  viscous  supersonic  diffuser  with  weak  oblique  shock 
illustrating  a  shock-boundary  layer  interaction.  Sketches  of  the  geometries 
and  ideal  solutions  are  shown  in  figures  2,  3  and  4.  Solutions  for  each 
geometry  were  run  to  steady  state  for  a  range  of  time  step  sizes.  For  con¬ 
venience  time  step  sizes  are  reported  in  terms  of  x  and  y  CFL  numbers: 

'At(u  +  c)  . 

(CFLx)  =  maximum  - ^  (22) 

i'j  . 

-At(v  +  c)  .  .• 

(CFL)  =  maximum  .  ■  ■  .  -VJ.  (23) 

y  Ay*  4 

J 


The  time  step  size  was  uniform  over  each  calculation  which  results  in 

non-uniform  CFL  and  CFL  numbers.  The  maximum  value  of  each  is  reported, 
x  y 

Sample  convergence  history  plots  are  shown  in  figure  5  which  shows  the  log 
of  the  value  of  the  point  maximum  steady  state  residual 


SSR 


3e  _  3r  3f  _  3s 
3x  3x  +  "Sy1  9y 


(24) 


plotted  against  the  iteration  number.  A  solution  was  not  termed  stable 
unless  the  residual  converged  to  the  machine  accuracy,  about  1  x 10  .  All 


calculations  used  a  32  bit  floating  point  word  size. 


39 


Each  geometry  calculation  was  run  with  fully  explicit  extrapolations, 

Au  =  0,  and  with  fully  implicit  extrapolations,  and  the  results  summarized 
in  Table  1.  The  most  interesting  of  these  results  are  shown  in  figure  5. 

At  a  time  step  size  corresponding  to  a  CFL^  number  of  15  convergence  was 
rapid  and  very  nearly  monotonic  in  time.  At  smaller  time  step  sizes,  the 
convergence  was  slower  but  nearly  monotonic.  At  a  CFL^  of  45  convergence 
rates  initially  appeared  to  be  faster  than  for  a  CFL^  of  15,  but  the  final 
residual  values  oscillated  significantly  about  its  minimum  value.  At  a 
CFL  of  90,  the  convergence  rate  was  substantially  slower  than  at  a  CFL 

X  X 

of  15,  and  at  larger  CFL^  values  the  solution  diverged. 

The  results  for  the  strong  shock  diffuser  can  reasonably  be  compared 
to  those  of  Yee,  Beam  and  Warming  [4_] .  They  reported  a  CFL  number  stability 
limit  between  10  and  20,  while  we  found  stability  limits  between  90  and  150. 
Thus  the  analysis  in  one-space  dimension  does  appear  to  provide  a  sufficient 
condition  for  stability,  but  it  may  not  provide  a  close  approximation  to  the 
stability  limit.  However,  it  is  essential  to  emphasize  that  the  largest 
convergence  rates  were  observed  at  time  steps  corresponding  to  CFL  numbers 
of  order  10  and  that  only  a  marginal  computational  time  advantage  for  the 
implicit  boundary  formulations  was  observed. 

The  results  for  the  shock-boundary  layer  calculation  are  very  inter¬ 
esting,  but  they  demonstrate  a  substantial  computational  advantage  for  the 
implicit  solid  wall  conditions,  not  for  the  inflow/outflow  extrapolation. 

Here  the  stability  boundary  and  the  best  convergence  rates  were  observed  at 
time  step  sizes  corresponding  to  CFL^  numbers  of  5  to  10.  When  using  the 
implicit  wall  conditions,  the  algorithm  stability  appeared  to  be  independent 
of  grid  spacing  in  the  normal  direction  as  might  be  hoped.  When  using  the 
explicit  wall  condition,  the  algorithm  stability  was  limited  to  a  CFL  number 


of  about  500. 


CONCLUSIONS  AND  DISCUSSION 


While  it  is  difficult  to  generalize  from  only  a  few  test  examples,  it 
is  apparent  that  a  better  appreciation  of  the  role  boundary  treatments  play 
in  implicit  algorithms  has  allowed  the  development  of  far  more  robust  Beam 
and  Warming  type  solvers.  For  both  explicit  and  implicit  boundary  treatments, 
we  were  able  to  accurately  compute  solutions  with  time  steps  50  to  100  times 
explicit  time  limits  while  retaining  the  ability  to  choose  rather  arbitrary 
initial  conditions.  In  many  cases,  our  limiting  time  steps  for  the  two- 
dimensional  test  problems  were  in  fact  larger  than  the  limit  which  a  one¬ 
dimensional  analysis  would  suggest. 

The  most  important  computational  result  we  observed  was  that  while  an 
improved  appreciation  of  boundary  treatments  did  allow  very  large  time  step 
sizes  to  be  used,  the  largest  convergence  rates  to  steady  state  were  observed 
at  relatively  small  time  step  sizes.  For  the  two-dimensional  test  problems, 
the  best  CFLx  numbers  were  of  order  10,  not  of  order  100.  One-dimensional 
test  examples  showed  no  such  convergence  rate  behavior.  Presently  unpublished 
analysis  by  Abarbanel,  Dwover  and  Gottlieb  [15]  has  linked  this  behavior  to 
the  approximate  factorization  form  of  equations  2  and  3.  This  effect  now 
seems  to  be  setting  the  time  step  sizes  for  our  viscqus  flow  computations 
and  new  work  should  focus  on  methods  for  overcoming  this  limitation. 
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Figure  2.  Computational  Grid  for  Weak  Shock  Diffuser  Calculations 
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Figure  3.  Computational  Grid  for  Strong  Shock  Diffuser  Calculations 
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Figure  4.  Computational  Grid  for  Shock-Boundary  Layer  Computation 


GRID  GENERATION  EXPERIENCE 


Summary 


One  of  the  most  difficult  tasks  to  be  performed  as  part  of  accurate 
computations  for  turbine  blade  geometries  is  grid  generation.  As  part  of 
our  efforts  in  grid  generation,  we  have  constructed  a  coordinate  generating 
scheme  [BOGG]  based  on  elliptic  (Poisson)  equation  solutions  which  seem 
attractive  for  internal  flow  calculations.  These  coordinate  systems  are  either 
orthogonal  at  boundaries  or  may  have  any  specified  angle,  are  periodic  upstream 
and  downstream  of  blade  rows,  and  may  have  arbitrary,  user-specified  spacing 
near  boundaries.  This  grid  system  does  not  correct  the  problem  of  highly 
sheared  grids  and  does  not  provide  adequate  grid  resolution  in  the  far  field. 
Analysis  of  the  calculational  results  indicates  that  the  present  use  of  the 
strong  conservational  law  Navier-Stokes  form  and  central  finite  differencing 
is  the  problem  rather  than  the  grid  systems.  Investigations  are  continuing 
to  determine  if  simple,  sheared  grid  systems  may  be  used  in  place  of  the  more 


complex  BOGG  grids. 


BOGG  -  Introduction 

One  of  the  major  problems  in  developing  an  efficient  numerical  calcu¬ 
lational  scheme  is  the  choice  of  a  proper  coordinate  system,  capable  of 
transforming  a  complicated  physical  domain  into  a  rectangular,  evenly  spaced 
calculational  domain.  In  the  viscous  flow  case,  the  coordinate  system  must 
give  a  high  resolution  region  close  to  solid  boundaries  in  order  to  capture 
the  boundary  layer,  and  its  lines  should  be  normal  to  this  boundary. 
Especially  demanding  is  the  grid  development  for  transonic  compressor  and 
turbine  cascades  due  to  their  complicated  shape,  the  complex  flow  structure 
and  the  presence  of  periodic  boundaries.  The  practical  problem  usually 


starts  with  a  given  grid  point  distribution  along  the  boundaries  coupled 
with  some  desired  properties  of  the  coordinate  lines  at  the  boundaries, 
and  seeks  a  one-to-one  napping  of  the  points  from  the  physical  domain  into 
the  computational  domain.  In  recent  studies,  many  new  methods  of  numerical 
coordinate  system  generation  have  been  developed  ([l]-[7]).  Most  of  these 
methods  are  based  on  the  solution  of  Poisson  equations  with  Dirichlet 
boundary  conditions,  giving  unique  and  continuously  differentiable  trans¬ 
formation  functions.  Only  few  of  them,  however,  provide  a  direct  control 
of  the  line  direction  and  spacing.  Thomas  [3]  provides  analytical  ex¬ 
pressions  for  the  controlling  functions,  but  this  method  fails  to  produce 
the  desired  results  in  all  but  the  simplest  cases.  Sorenson  and  Steger  [4] 
developed  an  effective  method  of  controlling  the  spacing  and  angles  at 
boundaries  for  0-  and  C-type  grid  systems.  Warsi  and  Thompson  [6]  introduced 
a  non-iterative  method  for  the  numerical  generation  of  orthogonal  curvilinear 
coordinates  for  plane  annular  regions  between  two  smooth  closed  curves. 
Although  most  of  these  methods  are  adequate  for  some  problems,  they  are  not 
fully  suitable  for  grid  generation  in  internal  flow  calculations.  The  grid 
generation  method  for  this  type  of  problem  has  to  provide  a  positive  control 
of  the  grid  lines  at  all  boundaries,  give  the  required  spacing,  generate 
coordinate  lines  without  excessive  skewness  while  being  simple,  fast  and 
generally  applicable. 

BOGG  uses  a  system  of  grid  generating  elliptic  (Poisson)  equations 
similar  to  those  in  [3],  but  it  introduces  a  simple  and  effective  iteration 
method  of  controlling  the  coordinate  line  angles  at  all  boundaries.  Coor¬ 
dinate  systems  were  developed  for  different  inviscid  and  viscous  internal 
flow  problems  and  used  with  success  in  an  implicit  two-dimensional  numerical 


scheme  [11] 
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30GG  Basic  Formulation 

The  present  method  seeks  a  coordinate  system  transform 
£  =  £(x,  y)  and  n  =  r\(x,  y) 
as  a  solution  of  the  Poisson  equations 

£  +  £  =  p(£  ,  n) 

xx  yy 


n  +  n  =  Q(£  /  n) 

xx  yy  * 

where  x,  y  are  the  Cartesian  coordinates  in  the  physical  domain  ft,  and 
£  ,  r|  are  rectangular  coordinates  in  the  calculational  domain  (Fig.  1) . 

The  coordinate  lines  £  =  const  were  in  most  of  the  cases  aligned  streamwise 
and  included  solid  surfaces;  the  coordinate  lines  n = const  were  usually 
normal  to  the  streamwise  direction  at  the  boundaries.  Equation  (1)  is 
subject  to  the  Dirichlet  boundary  conditions  £b  =  £b(x,y)  and  (x,y)  . 

The  right-hand  side  functions  P  and  Q  control  the  spacing  of  the  £  and  n 
lines  inside  the  domain  ft.  Since  the  values  of  £  and  n  are  fixed  on  the 
boundary,  a  change  in  spacing  within  ft  will  produce  a  change  of  the  angle 
at  which  the  £  and  r|  lines  intersect  the  boundary.  The  correct  value  of 
P  and  Q  is  not  known  a  priori,  so  that  an  iterative  procedure  over  the 
system  (1)  is  necessary  to  find  P  and  Q. 

The  effect  of  the  functions  P(£,ri)  and  Q(£  ,  n)  on  the  resulting 
coordinate  system  can  be  seen  by  comparing  the  solution  of  (1)  with  the 
corresponding  homogeneous  system 

£  +  £  =0 

“  yy  (2) 

v  *  v =  0 

Let  £,n  be  the  solution  of  (1)  and  £*,  f|*  the  solution  of  (2)  ,  subject 


to  the  same  boundary  conditions  as  (1) ,  Then,  in  a  subregion  D  of  ft,  a 


negative  P (£  ,  n  )  will  cause  a  line  £  =  i  =  const  to  move  closer  to  the  line 
m 

★ 

£  than  the  line  £  -  l,  since  £  and  n  are  subharmonic  on  D.  Similarly,  a 
m 

negative  0(  £,  r|  )  will  move  a  line  n  =  £  =  const  closer  to  the  line  f!  than 
m  m 

* 

the  line  n  = l.  Positive  values  of  P  and  Q  have  the  opposite  effect  on 

the  £  and  r|  lines.  If  the  curve  £  =  £  is  now  a  branch  cut  in  D,  this  branch 

m 

cut  will  move  in  the  direction  of  increasing  £  for  a  positive  P(£  ,  n ) . 

m 

The  magnitude  of  P  and  Q  determine  the  magnitude  of  the  movement  of  the 

coordinate  lines;  the  sign  of  P  and  Q  determine  the  direction  of  this  movement 

The  situation  at  a  boundary  n  =  const  =  n  .  is  shown. in  Fig.  2a.  Here  the 

mm 

position  of  the  £  =  const  line  on  the  boundary  is  fixed  by  the  boundary  con¬ 
dition  £  =  £,  (x,y)  .  A  negative  increment  in  P(  £,f|  .  )  will  cause  the  line 
b  mm 

£  to  move  inside  of  ft  in  the  direction  of  increasing  £.  This  movement  will 

— ► 

decrease  the  angle  y  between  the  line  £  =  const  and  the  tangent  t  to  the 

boundary.  A  positive  increment  in  P(  £,  n  .  )  will  cause  a  movement  of  the 

mm 

£  line  in  opposite  direction.  A  proper  value  of  P  can  therefore  give  the 

desired  angle  y=8,  0  <  8  <  tt,  Here  it  should  be  remembered,  however,  that 

the  influence  of  P(  £,  n  .  )  is  not  limited  to  the  line  £  but  affects  the 

mm 

coordinate  lines  in  the  entire  region.  In  most  of  the  practical  cases  the 

angle  forcing  requirement  is  not  limited  to  one  point  but  extends  at  least 

over  some  part  of  the  boundary  n  =  const.  This  means  that  P(£  ,  n )  has  to  be 

found  for  all  points  simultaneously  rather  than  by  pointwise  calculation. 

Fig.  2b  shows  the  analogous  case  of  the  influence  of  Q  on  r|  =  const 

lines  at  the  boundary  £  =  const  =  £  .  Here  a  oositive  increment  of 

max 

Q(£  ,  n )  will  decrease  the  angle  y. 


For  practical  calculations  the  inverse  relationship 

x  =  x(  5  ,  n) 
y  =  y(  Z,  n) 

is  needed.  The  inverse  form  of  the  system  (1)  is  (see  for  example  [1]! 


(3) 


"  2fot5n  +  yxim  *  'J  <PV!V 


aycr  2Sy?n  +  Yynn  =  'J  <PV8yn> 


(4) 


where 


2  2 
a  =  x  +  v 

n  *n 


3  =  xnx?  +  yny? 


Y  =  4. 


J  -  xry  -  x  Yr  =  Jacobian  of  the  transform 

£  n  n  5 


It  is  of  advantage  to  define  new  spacing  controlling  functions  as 


<J>  =  P  — 

a 


(5) 


The  values  of  the  functions  <*>  and  il>  are  several  orders  of  magnitude 
smaller  than  P  and  Q,  The  above 


explanation  for  the  effect  of  P  and  Q  on  the  grid  lines  applies  to  <f>  and 


2  2 

ip  without  restrictions  since  J  /a  > 0  and  J  /y  >  0.  With  (5),  the  equations 


(4)  become  finally 


26x5n  +  Y!W  + 


(6) 


aycr2ByCn  +  Yynn=  (a<ty5  +  Y%) 


subject  to  the  boundary  conditions  x  =  x^(^  ,n)  y  =  yv)(^frl)  .  Once  6 

and  ip  are  chosen,  the  system  (6)  can  be  solved  using  an  appropriate  numerical 


method. 


Choice  of  j>  and  ip 


The  previous  discussion  illustrates  that  the  angle  Y  at  the  boundaries 

is  determined  by  the  function  <j>  at  f)  =  constant  boundaries,  and  the  function 

at  £  =  constant  boundaries.  Starting  from  initial  values  of  <£  and  ip,  the 

controlling  functions  have  to  be  adjusted  in  an  iterative  manner  to  give 

the  required  angles  (8)  between  the  coordinate  line  and  the  boundary. 

In  general  the  left  boundary  is  defined  by  C  =  £  .  ,  the  right  boundary 

mm 

by  £  =  E,  ,  and  the  lower  and  upper  boundaries  by  n  =  ti  .  and  n  =  n 

max  mm  max 

respectively.  Solid  walls  are  generally  located  on  part  or  all  of  the  range 

g  =  5  .  to  5  at  Tl  *  n  ,  and  rj  =  .  The  most  common  requirement  for 

mm  max  mm  max 

this  type  of  grid  is  that  the  coordinate  lines  are  orthogonal  to  the 
boundaries  (6=^/2,  a  =  0).  Specification  of  an  angular  deviation  from 
orthogonality  is,  however,  allowed  (i.e.  the  angle  a  is  specified  non-zero) . 
The  following  discussion  applies  both  to  orthogonal  and  non-orthogonal  grid 
line  intersections  at  the  boundaries. 

The  procedure  necessary  to  obtain  the  desired  grid  angle,  8,  at  the 
lower  boundary  will  now  be  discussed.  The  nomenclature  used  in  the  following 
section  is  as  defined  in  Fig.  2a.  It  is  assumed  in  this  correction  procedure 
that  any  change  in  the  forcing  function  <f>  will  have  6nly  a  localized  effect 
on  the  coordinate  lines. 

It  was  stated  in  the  previous  discussion  that  decrementing  the  forcing 
function  $  (i.e.  will  reduce  the  angle  y.  Similarly  incrementing  <}>  will 
increase  the  angle  y.  The  following  conditions  therefore  apply. 


Y  >  8  we  require  A <p  <  0 
y  <  8  we  require  A<J>  >  0 


(7) 
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Choice  of  an  incremental  function  to  satisfy  these  criteria  will  force  the 
grid  lines  to  the  required  orientation.  The  forcing  function  <p  is  modified 
between  iteration  counts  i  and  1+  1  using  the  equation 

J  +  1  i  i 

4»<C  #n_.  )  =  *<  ?,n.  )  +  A<j>  (8) 

mm  mm 

l 

The  correction  factor,  Ac})  ,  used  in  this  computation  is  given  by 

A$a  =  ~sin^~  (A  >  1)  (9) 

This  expression  satisfies  the  movement  criteria  expressed  in  (7) .  The  damping 
factor  A(  > 1.0)  is  introduced  to  control  iteration  stability. 

A  similar  analysis  at  the  upper  boundary  leads  to  the  following 


iteration  sequence: 

x,,-  _  >£+l  x.r  _  .  sin  (Y  ■  8) 

C/ n  )  + — - -  do) 

ma.x  max  A 

The  value  of  <p  inside  the  domain  ft  is  obtained  by  linear  interpolation  along 
a  line  £  =*  constant. 


<D(£  ,n)  =  <M£,  n  .  )  +  ,n  >  -  <P(  £,n  .  )| 

mm  max  mm  in  -  n 

v  max  mm 


At  the  boundary  £  =  £  ,  ^  is  determined  by  the  sequence 

max 

i  ,  r  .2+1  ,  .2-  sin(y-B)  ,, 

'ML,  ,n)  =  +  - b — L  ' 

max  max  B 

and  at  £  =  £  .  . 

mm 

,  ,  2+1  ,  ,,  .2  sin(y-S) 

•  ,  n)  =  'MC-  ,  n) - r1 -  (; 

mm  mm  B 

The  angles  used  in  equation  (12)  are  defined  in  Fig.  2b.  Between  these 
boundaries  :p  is  calculated  along  a  line  f|  =  const  by  linear  interpolation. 


H»(5  ,n)  -  i|)(5  .n  ,n)  +  ,n)  -  H»(£m.n  ,n)  ? - <U) 

mm  max  min  i  t  -  *  . 

;  [  max  rain) 


If  no  specific  requirement  is  enforced  on  the  grid  intersection  angles  at 
any  boundary,  the  forcing  function  remains  constant  throughout  the  iteration 
procedure. 

The  above  correction  procedures  require  evaluation  of  the  sine  of 
the  angular  error.  This  sine  can  be  expanded  (on  the  lower  and  upper 
boundaries)  to: 

sin(y-S)  =  sin[(y+0)  -  (8+0)1 

=  sin(y+0)  cos (8+0)  -  cos(Y+0)  sin (8+0)  (15) 

where  the  angles  are  defined  in  Fig.  2a  for  the  lower  boundary.  An  angular 
deviation  (a)  from  orthogonality  may  be  specified  so  the  above  equation  is 
rewritten  in  terns  of  a,  Y  and  0  as: 

sin(Y-S)  =  -sin(Y+0)  sin(0-a)  -cos(Y+0)  cos(9-a) 

The  angle,  0-ct,  is  constant  throughout  the  iteration  procedure  and  is  there¬ 
fore  only  calculated  once.  The  sine  and  cosine  of  the  angle  between  the 
E,  =  constant  line  and  the  x-axis  (y+0)  are  evaluated  using  the  equations: 


where 

ri  =  n  ■  ,  ,  on  the  lower  boundary 

p  mm  +  l 

n  ■  1.  on  the  upDer  boundary 

p  r^ax 

On  the  left  and  right  boundaries,  the  sine  of  the  angular  error  is  expressed 
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sin(y-S)  =  sin[(y-0)  -  (3-0)] 

=  sin(y-0)  cos(B-0)  -  cos(y-3)  sin (6-6) 

=  sin(y-B)  sin(3+a)  -  cos(y-0)  cos(0+a)  (17) 

The  angle,  0+a  ,  is  again  constant  throughout  the  iteration  procedure,  and 
the  sine  and  cosine  of  the  angle  between  the  r|  =  constant  line  and  the  x-axis 
(Y-0)  are  evaluated  using  the  equations 


sin (y-0) 


-(y(5p/n)  -  y(^p-1/n)) _ 

/  (x(5p,n)  -  x(5p_1,n)l2  +  fy^p'H)  -  y(5p_1,n)) 2 


cos (y-0)  = 


x(£  ,n)  -  x(C  ,n) 

_ 2 _ El.1—  . 


/  (x(S  ,n)  -  x(£ »n))  +  (y(C  »n)  -  y(£  ,»n)V 

F  u  x  M  r  * 


(18) 


(19) 


where 


£  =  C  .  ,  on  the  left  boundary 

p  min  +  1 

C  =  £  on  the  right  boundary 

p  max  ’ 


BQGG  Iteration  Procedures 

Once  the  functions  <p  and  ip  are  established  the  system  of  Poisson 
equations  (6)  can  be  numerically  solved  using  finite  difference  methods. 

The  scheme  selected  was  successive  line  over  relaxation  (5L0R) .  SLOR  is 
first  applied  along  £  lines  and  then  n  lines  for  a  number  of  iterations. 

The  forcing  functions  are  then  corrected  and  the  procedure  repeated. 

The  over-relaxation  factors  used  in  the  SLOR  calculations  were 
generally  taken  as  1.15  for  evaluation  along  £  lines  and  1.05  along  n  lines. 
The  damping  factors  applied  to  the  forcing  function  corrections  [Eqs.  (9) , 
(10),  (12),  and  (13)]  were  generally  the  same  and  had  values  in  the  range 


1  to  5 


BOGG  Calculation  Examples 

The  simplest  type  of  coordinate  system  was  that  of  the  diffuser,  which 

consists  of  an  upstream  boundary  £  =  0,  downstream  boundary  £=  i  ,  solid 

max 

wall  boundary  at  n=  0  and  either  solid  wall  or  symmetry  line  at  r)=  j 

max 

The  lines  £ = const  were  required  to  be  normal  to  the  n  boundaries;  there  was 
no  specific  requirement  for  the  n  =  const  lines  at  the  £  boundaries  in  the 
inviscid  case.  The  viscous  grid  has  geometric  point  distribution  along 
the  £  lines  allowing  proper  boundary  layer  resolution.  The  spacing  in  the 
£  direction  is  constant.  The  maximum  error  criteria  for  0  was  0.2°.  The 
resulting  grid  systems  can  be  seen  in  Figs.  3,  4  and  5. 

Compressor  cascades  are  difficult  geometries  for  developing  coordinate 
systems  suitable  for  efficient  viscous  and  non-viscous  calculations.  The 
blade  shapes  chosen  for  examples  were  designed  by  Tong  [10]  with  help  of  an 
inverse  numerical  code.  The  physical  domain  consisted  of  the  region  between 
the  two  blades  and  an  upstream  and  downstream  region  bounded  by  blade  chord 
extension  lines.  Along  these  lines,  the  periodic  boundary  conditions  have 
to  be  applied,  requiring  the  same  point  distribution  at  the  upper  and  lower 
boundary  upstream  and  downstream  of  the  blade,  respectively.  The  upstream 
(£  =  0)  and  downstream  (£  =  40)  boundaries  were  located  one  chord  length  from 
the  leading  and  trailing  edge,  respectively.  The  viscous  grid  has  the  first 
and  last  n  lines  densely  packed  (geometric  distribution)  in  order  to  capture 
the  boundary  layer;  the  remaining  n  lines  are  uniformly  spaced.  In  the  non- 
viscous  grid  all  the  n  lines  are  uniformly  spaced.  The  £  lines  ahead  and 
behind  the  blade  have  geometric  spacing  to  provide  higher  resolution  close 
to  the  leading  and  trailing  edge.  The  maximum  error  of  0  was  0.5°. 


The  resulting  grids  are  shown  in  Figs.  6  and  7.  These  grids  were  used  with 
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considerable  success  by  Bush  [9,  11]  in  his  two-dimensional  viscous 
implicit  code. 

The  most  demanding  example  was  to  generate  a  grid  for  finite  difference 
calculations  of  the  flow  properties  in  a  turbine  cascade.  The  transonic 
turbine  blade  shape  was  the  same  as  used  by  Demuren  [12],  The  £ = 0  line 
is  the  U-shaped  upstream  and  periodic  boundary  extending  from  the  leading 
edge  of  the  lower  blade  to  one  chord  length  ahead  of  the  blade  to  the 
leading  edge  of  the  upper  blade.  The  periodic  boundary  is  located  at  the 
first  and  last  8  points  on  the  line  £  =  0;  between  these  two  regions  is  the 
upstream  boundary.  The  downstream  boundary  corresponds  to  £  =  40. 

The  inviscid  grid  in  Fig.  8  has  30r)  =  constant  lines.  The  viscous  grid 
has  60ti  =  constant  lines,  where  the  first  and  last  20  points  have  geometric 
distribution  that  allows  good  resolution  of  the  boundary  layer,  see  Fig.  9. 

The  requirement  for  the  £-lines  was  again  to  be  normal  to  the  n  =  0  and 
0=  jmax  boundaries.  The  resulting  C-type  grids  are  shown  in 

Figs.  8  and  9. 

The  versatility  of  the  present  method  can  be  best  demonstrated  on  the 

example  of  grid  generation  in  the  axial-radial  plane  of  a  transonic  compressor 

with  both  rotor  and  stator.  The  grid  consists  of  five  different  regions 

(upstream  of  rotor,  rotor,  between  rotor  and  stator,  stator  and  downstream 

of  stator)  that  are  treated  separately  and  then  joined  together.  In  this 

case,  both  line  types  £= constant  and  n =  constant  are  controlled  at  the 

corresponding  boundaries.  The  angle  between  the  £= constant  line  and  the 

upper  and  lower  boundary  changes  gradually  from  tt/2  to  the  angle  between 

£=i  line  and  either  the  line  n  =  0  or  n  =  j  .  The  n  =  constant  lines 
max  max 

max 


are  manipulated  at  the  £  =  0  and  £  =  i 


boundaries  to  give  smooth  transition 
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between  two  adjoining  regions.  The  £ = constant  lines  have  geometric  dis¬ 
tribution  near  the  leading  and  trailing  edges.  There  are  altogether 
98  £-lines  and  17  n-lines.  The  resulting  coordinate  system  is  shown  in 
Fig.  10. 

BOGG  Input  Details 

BOGG  requires  two  data  files  for  input.  One  contains  the  coordinates 
of  the  grid  boundary,  the  other  switches  and  factors  required  during  the 
calculation  procedure. 

The  input  files  have  the  following  format: 

X-Y  coordinate  file 

X,Y  Values  on  the  lower  surface  -  format  [2X, (8E14  . 7)  ] 

X, Y  values  on  the  upper  surface 
X, Y  Values  on  the  left  surface 
x, Y  Values  on  the  right  surface 
Setup  file 

Line  1:  Number  of  n  lines,  number  of  £  lines  -  format  (2X,2I5) 

Line  2:  Number  of  forcing  function  correction  iterations,  number 
of  SLOR  iterations  -  format  (2X,2I5) 

Line  3:  Damping  factors  A  and  B  -  format  (2X,2fl0.6) 

Line  4:  Over-relaxation  factors  applied  along  £  lines  and  f|  lines  - 
format  (2X,2fl0.6) 

Line  5:  Minimum  angle  deviation  for  convergence  (RAD)  -  format 


(2X, flO  .6) 

Line  6:  Intersection  angle  options  on  lower,  upper,  left  and  right 


boundaries  ISTART,  IEND,  IANG  -  Format  (2X,3I5) 


end  of  the  angle  option  and 

IANG =  0  denotes  intersection  angle  not  forced 
IANG  =  1  denotes  grid  line  orthogonal  to  boundary 
IANG  =  2  denotes  angular  deviation  from  orthogonality 
specified. 

If  IANG  equals  2,  the  angular  deviation  (RAD)  is  then 
input  -  format  [2X, (8E14. 7) ] . 

BOGG  Output  Details 

The  program  generates  an  X- Y  coordinate  file  of  grid  intersection 

locations.  At  each  t,  =  constant  line  (moving  from  £  .  to  C  ),  the  X 

mm  max 

coordinates  are  output  for  r)  =  n  .  to  rt  =  n  [format  f2x(8E14. 7)  1  ]  ,  followed 

mm  max 

by  the  Y  coordinates  in  the  same  order. 

Geometry  Interpolation  Codes 

A  series  of  computer  codes  has  been  written  as  an  interface  between 
Rolls-Royce  turbine  geometric  description  and  standard  GTL  usage  [13], 

A  pre-processor  reads  a  Rolls-Royce  proconsul  file  of  blade  coordinates 
and  generates  blade  profile  coordinates  in  a  form  compatible  with  standard 
GTL  usage.  Two  more  codes  interpolate  this  data,  using  a  Rolls-Royce  supplied 
spline  interpolation  routine,  on  to  user  specified  X.cuts  and  interface  with 
the  grid  generation  programs.  A  fourth  program  has  been  written  to  inter¬ 
actively  display  and  revise  the  resulting  geometries  and  grid  lines. 
Grid-Related  Errors 

As  a  test  case  for  turbine  type  geometries,  an  inviscid  calculation 
of  the  flow  through  a  turbine  cascade  was  attempted. 

The  grid  chosen  was  a  throughflow  type  grid  as  shown  in 

Figure  11.  This  grid  had  40  points  in  the  axial  direction  and  20  points 

in  the  tangential  direction.  Calculations  based  on  this  grid  were  extremely 
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disappointing  especially  in  terms  of  stagnation  pressure  error.  A  contour 
plot  of  stagnation  pressure  loss  for  this  turbine  cascade  is  shown  in 
Figure  12.  upstream  of  shock  waves,  any 
stagnation  pressure  loss  is  to  be  considered  a  solution  error.  The 
maximum  error  is  about  15%  near  the  leading  edge,  but  a  10%  error  is  ob¬ 
served  in  the  inlet  region  where  the  solution  should  and  does  have  very 
nearly  uniform  velocity. 

Since  this  type  of  stagnation  pressure  error  is  known  to  be  common 
to  most  time  marching  algorithms,  considerable  time  was  devoted  to 
analyzing  the  error  mechanism  for  Beam  and  Warming  type  codes.  It  was 
felt  that  there  were  two  distinct  error  mechanisms,  one  associated  with 
solid  wall  boundary  conditions  and  a  second  one  associated  with  the 
truncation  error  of  the  scheme.  It  was  found  that  the  stagnation  pressure 
errors  upstream  of  the  blade  rows  correlated  well  with  the  second  mechanism 
and  were  not  due  to  contamination  of  the  solution  by  boundary  condition 
errors. 


In  order  to  understand  the  truncation  error  of  the  steady  state 
finite  difference  solutions,  it  is  necessary  to  examine  the  strong  con¬ 
servation  law  form  of  the  equations  being  solved  which  are  the  two 
dimensional,  transformed,  Euler  equation: 


U.  +  E_  +  F  =  0 

t  t,  n 


E  =  (C  E  +  $  F)  J 
x  y 

f  =  (n  e  +  n  f)  j 
x  y 


(20) 


(21) 
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In  operator  notation  we  have 


“’“’j+iVj+i  •  (pu)j-i  Vj-i +  (Pu,k-isjWpu)k+isjVi 

+  -  ""’j+Avi +  <pv)k*i5jVi  -  “hVh  * 0 


(26B) 


This  flux  balance  has  the  correct  cell  face  area  associated  with  the  appro¬ 
priate  flux  terms,  but  the  flux  across  any  face  is  approximated  as  the 


value  at  the  face  mid-node 


(pu)  f°r  example.  This  flux 


balance  is  common  to  all  schemes  using  the  strong  conservation  law  form  and 
2nd  order  central  differencing. 

One  estimate  of  the  error  in  this  flux  balance  can  be  obtained  if  one 
estimates  the  same  flux  balance  on  a  grid  twice  as  fine  as  the  present  grid, 
as  illustrated  in  Figure  15.  The  new  flux  sum  will  be: 

^j+~,k  "  ,k  *j,k+y  “  ^j,k-y 

As"  ?  * - hs - 1271 

We  will  estimate  the  term,  E.  1  .  ,  as: 

3+y,k 

r  ?  5 

E.  1  =  E  ~  +  F 

3  2  ,k  J  J  'j+-k 

J  2 


=  —  (E  +  E  )(5y  6  y  )  -  (F  +  F .  )  (6,  X.  ,  +  6,  X  . ) 

4  j+l,k  j ,k'  ky]+l  }+l,k  ] ,k  k  j+l  k  j 

If  we  subtract  equation  (26B)  from  equation  (28)  we  get  a  vector  error 

estimate  of: 

(Vj’AV  *  AVAV 

*lli\llVj|tlVi,l,iV  <29) 

Note  that  £  is  a  vector,  not  a  scalar  quantity.  Figure  16  shows  a  contour 


plot  of  C  ,  error  for  energy  equation,  for  the  turbine  cascade.  The  corres¬ 
pondence  between  this  error  parameter  and  the  stagnation  pressure  error  shown 
in  Figure  12  is  encouraging. 


An  unexpected  result  on  this  analysis  is  that  one  terra  in  equation 
(29)  is  zero  for  sheared  grids  like  those  shown  in  Figure  17  .  For  this 
grid,  all  y  running  coordinate  lines  are  parallel,  and  c^x..  =  0.  This  fact 
suggests  that  a  sheared  grid  might  reduce  the  stagnation  pressure  error  and 
Figure  18  demonstrates  that  indeed  it  does. 

Conclusions 

The  status  of  the  grid  generation  effort  is  at  present  very  open-ended. 
The  grid  error  analysis  shows  that  the  scheme  truncation  error  can  be  made 
as  small  as  desired  when  the  grid  spacing  is  chosen  small.  In  addition, 
one  possible  interpretation  of  the  error  analysis  is  that  sheared  grids 
will  be  adequate  even  for  turbine  type  geometries.  Considerable  effort  is 
now  being  devoted  to  determining  if  blunt  leading  and  trailing  edges  and 
thin  shear  layers  can  be  satisfactorily  analyzed  with  these  grids.  Until 
these  sheared  grids  can  be  shown  to  be  inadequate,  most  computations  will 
be  done  with  the  sheared  grids  rather  than  the  BOGG  grids.  The  BOGG  grids 
do  offer  good  leading  edge  resolution,  but  they  do  not  solve  the  problem 
of  sheared  grid  lines,  see  Figure  11  trailing  edge  region.  In  addition  the 
far-field  structure  of  these  grids  also  appears  to  be  a  source  for  stagnation 
pressure  errors . 

Analysis  of  the  present  Beam  and  Warming  type  algorithmic  use  of  the 
strong  conservation  law  forms  shows  that  the  present  algorithm  flux  balance 
is  not  very  accurate.  A  primary  goal  for  future  code  development  is  to 
improve  this  flux  balance.  Three  possible  methods  to  improve  the  flux  balance 
are:  one,  introduce  fourth  order  accurate  finite  difference  forms  on  the 
steady  state  solution  evaluation;  second,  introduce  an  improved,  specialized 
flux  balance  like  the  error  analysis  in  the  steady  state  solution  evaluation; 
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and  third,  introduce  a  combined  finite  difference  -  finite  element  formulation 
to  provide  the  most  accurate  solutions.  Each  of  these  schemes  will  be 
investigated  to  determine  their  possible  application. 
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AN  IMPLICIT,  bi-diagonal  numerical  method 
FOR  SOLVING  THE  NAVIER-STOKES  EQUATIONS 
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W.T.  Thompkins,  Jr.** 
Massachusetts  Institute  of  Technology 
Cambridge,  Massachusetts 


Abstract 

In  recent  years,  much  progress  has  been  made  in 
solving  fluid  dynamical  problems  -using  finite  dif¬ 
ference  methods.  Solving  inviscid  compressible 
problems  in  two  and  three  dimensions  has  become  al¬ 
most  routine  with  many  suitable  methods,  explicit 
or  implicit,  available.  The  problem  of  compressible, 
viscous  flows  in  complicated  geometries  remains, 
however,  a  major  challenge.  Here  the  fine  mesh 
spacing  in  the  boundary  layer  region  makes  the  ex¬ 
plicit  methods  with  their  simple  boundary  conditions, 
impractical.  Existing  implicit  methods  can  make  use 
of  large  time  steps,  but  require  costly  inversions 
of  large  block-tridiagonal  matrices.  A  method  re¬ 
cently  developed  by  MacCormack  eliminates  this  dis¬ 
advantage  by  introducing  a  predictor-corrector  scheme 
requiring  the  inversion  of  only  block  bi-diagonal 
matrices.  It  is  the  aim  of  present  work  to  extend 
this  method  to  allow  solution  of  viscous,  compres¬ 
sible  problems  in  general  coordinates  for  arbitrary 
two-dimensional  geometries. 

Introduction 

In  recent  years,  much  progress  has  been  made  in 
solving  fluid  dynamical  problems  using  finite 
difference  methods.  Solving  inviscid  compressible 
problems  in  two  and  three  dimensions  has  become  al¬ 
most  routine  with  many  suitable  methods,  explicit 
or  implicit,  available.  The  problem  of  compressible, 
viscous  flows  in  complex  geometries  remains,  how¬ 
ever,  a  major  challenge. 

The  fine  mesh  spacing  required  in  viscous 
regions  makes  the  explicit  methods  with  their 
simple  boundary  conditions,  such  as  the  MacCormack 
explicit  scheme  [1,2]  impractical.  The  existing 
implicit  methods,  such  as  Beam  and  Warming  [3]  or 
Pulliam  and  Steger  [4],  make  the  use  of  large  time 
steps,  corresponding  to  Courant  numbers  of  0(102), 
possible,  but  require  costly  inversions  of  large 
block-tridiagor.ai  matrices.  A  method  recently 
developed  by  MacCormack  [5]  eliminates  this  disad¬ 
vantage  by  introducing  a  predictor-corrector  scheme 
requiring  the  inversion  of  only  block  bi-diagonal 
matrices.  The  resulting  difference  equations  are 
either  upper  or  lower  block  bi-diagonal  equations 
that  can  be  solved  easily  in  one  sweep.  Unfor¬ 
tunately,  this  method  was  demonstrated  only  for  the 
simple  case  of  flat  plate  shock-boundary  layer 
interaction.  It  is  the  aim  of  present  work  to  ex¬ 
pand  this  simple  implicit  method  to  allow  solution 
of  viscous,  compressible  problems  for  general  two- 
dimensional  geometries. 


and  time.  Then  the  second  step  is  used  to  remove 
the  stability  restriction  of  the  first  step  by 
transforming  the  equations  of  the  first  step  into 
an  implicit  form.  The  resulting  matrices  are  block 
bi-diagonal  and  can  be  easily  solved.  The  Jacobian 
matrices  of  the  governing  flow  equations  are  ex¬ 
pressed  in  a  convenient  diagonalized  form,  making 
any  matrix  inversion  unnecessary.  The  method  was 
tested  on  a  number  of  numerical  examples,  including 
incompressible  and  compressible  Couette  flow  and  a 
supersonic  diffuser  with  shock-boundary  layer 
interaction. 

Development  of  Algorithm 

The  two-dimensional  compressible  Navier-Stokes 
equations  can  be  written  in  the  conservation  law 


3u  3F('J)  3G(U) 


Du  +0 


puv  T 


(e  +  0  )u  +  T  v-km— 
t  x  xy  3x 


Duv  T 


Dv*  +  O 


(e+0)v+T  u  -  k  -c— 

(  t  y  xy  ay  J 


p  -  (y-l)  (et-i»u2+v2)  ] 


_  ,  f  3u  *  3v  ,  du 

0  *  p  -  *■  t—  +  -T—  -  2  J  ^ — 

X  r  ^dx  dy  ax 


f3u  3vl 
‘I3y  3xj 


„  .  f3u  ^  3vl  ,  3v 

y  I 3x  ay)  3y 


Equation  (1)  can  be  expressed  in  nondimens lonal 
form  by  defining  nondimensional  variables 


The  present  method  uses  as  its  first  step  the 
explicit  predictor-corrector  finite  difference 
method  of  [1J,  approximating  the  governing  fluid 
flow  equations  to  second  order  accuracy  :r.  .'pace 
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a  oat 

o  _  o  o  o 

t1  *  tr-  Re  *  — ; - 

1  o  U 

o  o 


The  primes  will  be  dropped  later  on  for  convenience. 
The  vectors  U,  F  and  G  then  become: 


’o' 

P'u* 

O'u' 

2 

p'u'^  ♦  a* 

X 

0 1  v 1 

F  - 

p‘u'v'  ♦  x* 

xy 

®  V 

1  3t* 

(e'+o'lu’+T*  v'-k’- — — -  ~ 
t  x  xy  Pr  Re  (y-l)  ox’ 

o  o 

O' v' 

0 '  u 1  v 1  +  T ' 

xy 

o*  v'  2  +  o* 

y 

1  3t' 

<•'  +  a'  )  v  +  t'  u'  -  k'  - — c~ — — 

t  y  xy  Pr  Re  ;y-l  ly1 


D  a  -  Jv  n  *  .T  X  _ 

X  y£  y  £  (4) 

The  method  of  numerical  integration  of  equation 
(2)  has  been  adopted  from  MacCormack  [51 ,  where  it 
is  explained  in  detail.  The  resulting  integration 
scheme  will  be  therefore  presented  without  detailed 
development.  Equation  (2)  is  integrated  by  the 
following  implicit  predictor-corrector  set  of  unite 
difference  equations: 


Predictor : 


a+  f."  .  a+  a.n .] 


iGi"j  *  +  j 


o n+1  -  o.n .  + 5 o n+1 

i/j 


i  A 1  fau1  3v*  u'  3u' 

X  ’P  Re  ^3x'  *  3y'J  Re  3x‘ 


V  [ 3u*  3v' 


xy  ReQ  [3y'  3x' 


.  X'  f3u’  3v'l  ,  u'  3v* 

y  *p  ~  Re  t.3x'  3y'J  Re  3y' 


Corrector: 

.  -  £  n+1  .-  i  n+1  I 

— -  A  F.  .  AG.. 

<?  Kv2 — s^l 


<f|iH  L 


1?£—  afl"*1-  Au.n+1 
An  J 


5n+1.i  fu.n .  +  5.";1  +  «g.H 


The  equation  of  state  for  perfect  gas  becomes : 

p'  -  (Y-l)  0'<u'2  +  V2)  1 

It  is  also  interesting  to  note  that: 

T*  -  -^7-  ;  a'  »  /T'  i  M  »  M*  «  jr 

In  most  of  practical  cases  it  is  necessary  to 
transform  the  Cartesian  coordinates  x,y  into  3ome 
more  general  coordinates  £(x,y),  n(x,y) .  The  strong 
conservation  law  form  of  (1)  can  be  maintained  as 
shown,  for  example,  by  Vinokur  (6).  Equation  (1) 
can  be  then  written  as 

3u  3F  3g  ... 

Tt*  3£  *  In  ’  0  (2) 


u  *  t  .  f  »  (f£  +  c£  )/j  ,  g  -  (Fn  + an  )/J 

J  x  Y  X  / 

J  is  the  transformation  Jacobian 

J  .  - i -  .;l  -  in  : 

x.-yn  -  xnFr  x  y  y  x 

The  metrics  £  ,  £  ,  etc.  are  easily  formed  using 
the  relations*  y 


where  |a| ,  [ B |  are  matrices  with  positive  eigen¬ 
values^  related  to  the  Jacobians  A  *  (3?/3u)  and 
B  »  [3C/3U),  (A V&£)  and  (A+/An)  are  one-sided  for¬ 
ward  differences  and  (A“/A£)  and  (A*/An)  are  one¬ 
sided  backward  differences. 

A 

The  Jacobians  A  and  B  are  related  to  the 
Jacobians  A  »  3F/3U  and  B  =  3G/3U  by 


A  -  A£  +  3£ 
x  y 


b  *  An  ♦  Bn 
x  y 


and  are  given  in,  for  example,  Steger  [7] • 

A  » 


0 

^x 

r 

’y 

0  1 

-uo.  +  £  3a 

t  X 

ur  -  (8-i)£  u 

£yU  -  3£xv 

8Cx 

vu.  ♦  £  Sa 
^  V 

-3£yu  +  £xv 

Or*  (8-1)  £yv 

3Cy 

.  A41 

A42 

A4  3 

(3+UU. 

’J 

t  2  1 

la  .  . 


A4,  .  B5[«<B-1J  -  Tl.  A42  .  -3U.U+  lT  +d,gx  , 


and  A._  *  -BU-v  4k.  . 
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here  a  *  y(u2wv2)  ,  3  =  (Y-l) ,  a=»V(?/D),  Ax=3a/5x, 
%  t  *  (3£/3y)  and  'J-,  are  the  ccntravariant 
velocities 


a.  *  ut  +  vt 

-  X  V 


U  =»  un  +  vh 
n  x  •/ 

where  h  =  3n/3x  and  n  =  3n/3v.  The  Jacobian  natrii 
„  x  y 

3  is  obtained  by  substituting  "J_,  c  ,  A  by  U  , 
h  ,  H  .  g  x  y  n 

x  y 

The  integration  scheme  (51  can  be  much  simpli¬ 
fied  by  diagonalizing  A  and  3.  Once  the  eigen¬ 
values  of  A  and  B  are  known,  it  is  possible  to 
express  A  and  B  in  the  form 


A  ,  A  —X 

S-A.  S- 


3u  Jv  T 

0  ' 


If  now  P = k  A + k  3  for  the  conservative  sy3ten 
-  -1  ^  2 

then  P  =  11  ~?M  gives  the  prcter  relation  for  con¬ 
version  of  ?  from  conservative  to  r.cncor.i creative. 
Both  P  and  ?  have  the  same  eigenvalues,  so  that 


P  -  MT  A(MT)  (10 

Equation  (10)^  can  be  used  to  find  the  matrices  S-, 

A.,  sf'and  3  .  A_,  S -1  in  (3).  One  can,  for 
At  n  3  0 

example,  set  the  arbitrary  k^  and  k2  to  k,  = Ax< 

k,  =■  t  and  obtain 
2  Y 


9  =  S_A_  S. 


P=tA  +  t3  =  A 


where  A  and  A  are  diagonal  matrices  consisting  of 

the  eigenvalues  of  A,  A  ,  ...  A  ,  and  3,  A  , 

A  t  -w  A » 4  3  f  I 

...  A„  >f  resoectively.  The  vectors  3.  and  S  are 
8,4  a^ 

constructed  using  the  eigenvectors  of  A  and  B, 
respectively,  as  columns. 


The  comcanson  of  (10)  and  (3!  leads  to 


Sr  can  be  found  by  carrying  out  the  multiplication 


ill  (11)  .  The  result  is 


Warming  et  al  [8]  found  the  eigenvalues  and 
eigenvectors  of  the  nonconservative  system  corres¬ 
ponding  to  (1) 

30  -  35  -  3(3  „ 

3?*  A3T*S37  ”  ° 

They  give  the  eigenvector  matrix  T  such  that  if 
?  »  k,S  +  k2§,  then 


for  arbitrary  k^,  k2-  The  matrix  T  is: 
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where  c^  «  +  5^2  .  Sr  ^  is  found  simply  from 

-  -1  -l  -l 
S.  -T  Mc 
A  A 


[°  ®  3-  »- 


The  Jacobians  A  and  3  of  the  conservative  system 
(1)  are  obtained  from  the  ncnconservacive  A  and  3 
by 


X  ■  M*1  AM 
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(13! 

A 

is  formed  from  eigenvalues  of  A. 
by,  for  example,  Steger  [7]  : 
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(14) 


S  ,  . \ _  and  S  1  are  similar;  they  can  be  found  by 

n  o  n 

replacing  £  ,  £  and  Ur  by  n  ,  H  and  U  .  Following 
x  y  ^  x  y  H 

closely  the  MacCormack  approach  in  [51 ,  the  matrices 
(a|  and  (b|  in  (5)  are  formed  by  replacing  the 

matrices  and  by  positively  valued  diagonal 

matrices  D.  and  D_  such  that 

A  B 
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fA+Fn  .  A*G  n  .1 

1)  let  U.n  .»  -At  — ■>- 2  +  — ; — ti]  at  every  point 

i,3  (  A£  -n  J 

inside  the  calcuiational  domain. 


2)  solve 


At 


I  -  At 


A+Ia!  |  -p  n+T 


,  iU."  *  AU.‘  .  for 

An  )  i,j  i.j 


/v  1 

OU  '  .  .  This  is  done  in  two  steps  bv  denoting 
1/3  +  ^  _ 

5u.*  . »  fl  -  At  A  J  B I }  50  .n*}  ,  resulting  in  two 
1.2  l  An  j  1.3 

vector  equations. 
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b) 


At 
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(x-4t  <1*^45. 
^1  -  At 
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i.3 
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1.3 


A  closer  look  at  step  2a  reveals  that  it  is  an 
upper  bi-diagonal  equation  that  can  be  solved  by 
sweeping  in  decreasing  i  direction  for  a  constant 
n.  Substituting  (8)  in  step  2a  and  some  simple 
manipulation  leads  to 


I  +  At  5r  D,  s;1  5U*  . 

£i  -i  Ai  i  J  l'j 

i»J  1/J 


^i.j^^i+lWl.j 


(16) 


AU  .n  .  +  At  I A  |  •  5U  .  . 

1,3  i+1, 3  1+1,3 


Denoting  w 
some  matrix  multiplication  gives  finally 


and 


5u.*  .«  S.(I  +  AtD,)_1  S.-1w 
1,3  A  At 


(17) 


Equation  (17)  can  be  very  easily  solved  since 

S_  and  S.“^  are  known  and  the  inversion  of  the 

diagonal1*  matrix  (1+AtD^)  is  trivial.  After  all 

5u*  .  inside  the  calcuiational  domain  are  determined 

the' 3 step  2b  can  be  carried  out  in  the  same  manner. 

The  corrector  steps  are  analogous  to  the  predictor 

steps.  The  major  problem  in  the  above  described 

scheme  is  finding  the  proper  boundary  values  of 

the  expression  Atjal.  .50.  for  i  *  i  ,  i  =*  1 
i r j  l »  J  max 

and  3*3  and  j  »  1,  because  these  are  not  known 
at  the  timexwhen  the  sweep  begins.  They  will  be 
dealt  with  in  the  next  section. 


and 


Boundary  Conditions 
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Formula  (15)  for  D  and  03  assunes  that  viscous 
effects  are  modeled  in  the  implicit  part  of  the 
scheme  by  addition  of  the  terms  A  ,  \g  which  inciu/e 
viscosity  through  the  coefficient  v.  The  elements 
of  D^,  Dg  are  non-negative:  negative  values  of  the 
elements  of  D  ,  Dg  mean  that  the  CFL  condition  is 
met  and  there "is  no  need  to  use  the  implicit  portion 
of  the  scheme  (5).  The  integration  scheme  (5)  can 
now  be  carried  out  in  the  following  3teps: 


Since  the  present  method  uses  upwind  spatial 
derivatives,  a^starting  value  of  the  expression 
At|8|5u  or  At | A | 5u  is  needed  for  both  the  predictor 
and  corrector.  For  simplicity,  we  will  devote  these 
expressions  5W;  they  represent  the  implicit  pact  of 
the  boundary  conditions.  The  explicit  boundary  con¬ 
ditions,  needed  for  evaluation  of  the  expression 
AUn,  are  obtainable  more  easily.  At  the  boundary, 
equation  (16)  will  take  the  following  form: 

(I  +  AtS.  D,  seT1)5uT  .  *  Aun  .+  5W  (18) 

5i.j  \.j  5  1,3  1,3 
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In  order  to  maintain  the  unconditional 
stability  of  scheme  (5) ,  the  value  of  6w  would  have 
to  be  evaluated  implicitly.  An  improper  treatment 
of  the  implicit  part  of  the  boundary  conditions, 
such  as  lagging  in  time,  will  limit  the  stability 
region  of  (5)  to  CFL  members  of  0  (1).  The  present 
work  developed  some  ad  hoc  procedures  for  deter¬ 
mining  the  value  of  <5w  that  gave  satisfactory  re¬ 
sults  for  the  test  cases  used.  A  more  vigorous 
treatment  of  the  boundary  conditions  for  the  pre¬ 
sent  method  will  be  required  to  develop  their 
generally  valid  formulation. 


1  0  0  o' 

0-100 
R  0  0-10. 

0  0  0  1. 

There  still  remains  the  problem  of  finding  the 

predictor  value  of  W.  The  present  work  used  the 

following  procedure: 

SWn+1  =  R  •  it  (  |  B  |  60).n  -  6wn 

'  1  ]-2 


Following  boundary  conditions  were  implemented: 

a)  Supersonic  inflow  boundary:  At  these 
boundaries,  all  the  eigenvalues  of  A  are  positive, 
so  that  all  characteristics  point  from  outside  into 
the  computational  domain.  All  the  elements  of  U 
are  therefore  specified  at  this  boundary.  For 
time-independent  boundary  conditions  one  obtains: 

0,  pu,  pv,  e  »  constant;  6 W  =  0. 

b)  Supersonic  outflow  boundary:  Here  all 
the  characteristics  have  the  direction  form  inside 
to  outside.  All  the  values  of  U  must  be  therefore 
extrapolated  from  the  computational  domain.  The 
explicit  part  of  the  boundary  conditions  does  not 
represent  any  problems;  the  elements  of  U  are  in 
this  case  linearly  extrapolated  from  the 
computational  domain: 


0  »  20 
i  i  -1 

max  max 


i 

max 


(19) 


while  switching  the  direction  of  the  predictor 
sweep  after  every  completed  predictor-corrector 
sequence.  This  is  equivalent  to  lagging  the 
boundary  value  of  change  of  flux  one  half  time  step. 


d)  Periodic  boundaries:  The  periodic 
boundaries  were  again  placed  between  the  first  and 
second  grid  point,  and  6w  was  lagged  one  half  step. 
Placing  the  periodic  boundaries  at,  for  example, 
i  «  1  +  (1/2)  and  i  «  i  -(1/2)  resulted  in  the 
following  scheme:  For  prflictor  sweeping  in 
direction  of  increasing  i. 
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The  same  extrapolation  was  applied  to  6w. 

<5W  *  2At  (|a|60)  -  U  (20) 

"max  "  "max 

Since,  at  the  time  of  evaluation  of  Sw,  the  ex¬ 
pressions  on  the  left-hand  side  of  (20)  are  not 
known,  6w  would  have  to  be  calculated  by  using  an 

implicit  scheme  at  the  three  points  i  ,  i  _i , 
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In  the  present  work  the  CFL  number  in  the 
x-direction  was  always  less  than  1,  so  that  purely 
explicit  boundary  conditions  were  used,  giving  (19) 
and  5w  *  0. 

c)  Solid  wall  boundary:  The  wall  was  placed 
between  the  first  and  second  grid  point  and  re¬ 
flective  boundary  conditions  were  used.  The  ex¬ 
plicit  boundary  conditions  for  an  adiabatic  wall  are 
then: 
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The  corrector  value  of  5 w  can  be  obtained 
using  the  same  principle  from  the  predictor  value  of 
AtlsfiO  at  the  point  j  *  2 


6w"+1 

"max, j 


6w 
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2,J 


Here,  the  grid  lines  are  assumed  to  be  orthogonal 
at  these  boundaries. 


No  attempt  was  made  at  this  time  to  create 
nonreflective  implicit  boundary  conditions  based  on 
the  characteristics  of  (5) .  Note  also  that  the 
outflow  boundary  conditions  are  only  explicit.  The 
present  method  is  therefore  limited  to  supersonic  in¬ 
flow  and  supersonic  outflow  boundaries  with  CFL  number 
0(1)  in  the  outflow  direction  or  periodic  inflow/ 
outflow. 

Sxamples 

The  present  numerical  method  was  tested  on  two 
examples,  the  Couette  flow  and  a  supersonic 
diffuser. 


it*"*1  «  R  •  At( ; B  I  6u> n*1  (21) 

1-2 

where 


A)  Couette  flow:  The  physical  dcnain  consisted 
of  stationary  lower  wall  at  j  =  1,  uniformly  moving 
upper  wall  at  j  -  16  and  two  periodic  boundaries  at 
i-1  +  (1/2)  and  i  *  10  +  (1/2).  The  computational 
domain  had  16  grid  lines  in  the  C-direction  and  11 
grid  lines  in  the  ^-direction.  Several  cases  were 
run  for  Reynolds  numbers  based  on  the  distance  be¬ 
tween  the  upper  and  lower  wall  between  6.2  and 
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2.3  x  10  ac  Mach  numbers  between  0.09  and  0.75. 
Courar.t  numbers  in  0-direction  of  up  to  1000  did 
not  cause  any  problems,  but  the  Courant  .number  in 
5-direction  was  limited  to  0(1)  by  the  periodic 
boundary  conditions.  Depending  on  the  Reynolds 
number  and  the  Courant  number,  the  steady  state 
solution  was  reached  after  10  to  500  iterations. 

The  results  for  U  =  0.75  are  shown  in  Fig.  1. 
up 

B)  Supersonic  diffuser:  The  supersonic 
diffuser  flow  calculation  was  performed  to  test 
the  present  method  on  a  more  demanding  case  for 
which  analytical  and  experimental  data  are 
available.  The  diffuser  had  the  shape  shown  in 
Fig.  2.  It  has  a  straight  lower  wall  and  upper 
wall  with  a  compression  corner  to  produce  a  shock 
of  required  strength.  The  orthogonal  grid  shown 
in  Fig.  2  was  numerically  generated.  There  were 
51  grid  lines  in  n-direction  and  51  grid  lines  in 
5-direction,  with  20  grid  lines  in  the  viscous 
layer  region  close  to  the  wall,  upstream,  the  Mach 
number  was  2,  and  the  velocity  distribution  corres¬ 
ponded  to  Re  *  1.25  x  104.  The  pressure  ratio 
across  the  sSock  was  ?,/,  ■  1.2;  it  was  chosen  such 
thac  direct  comparison'with  experimental  results 
(9]  was  possible.  The  boundary  layer  was  assumed 
to  be  laminar.  At  the  ^reflection  point  the  Reynolds 
number  was  Re  »  3  x  10  ;  the  pressure  increase  due 

X 

to  the  shock  caused  boundary  layer  separation.  The 
resulting  pressure  profiles  after  600  iterations  at 
CFL  number  Cf  "  160  agree  rather  well  with  ex¬ 
perimental  data  from  Hakkinen  et  al  [9]  (see 
Fig.  4)..  The  pressures  at  the  walls  display 
slightly  higher  values  than  the  experiment  due  to 
the  blockage  effect  of  the  boundary  layer.  The  cf 
coefficient  in  Fig.  4  shows  again  good  agreement 
with  analytical  data  by  Bush  (10]  and  experimental 
data  which  were  performed  on  a  flat  plate. 

Some  problems  were  experienced  with  the 
boundary  conditions  in  n-direction.  Best  results 
were  obtained  by  switching  sweep  directions  after 
one  complete  time  step.  The  corrector  value  of  5w 
was  reflected  off  the  opposite  wall.  Although  the 
method  has  natural  dissipation,  the  steep  gradient 
at  the  shock  boundary  layer  interaction  region 
caused  sometimes  stability  problems.  A  weak  fourth 
order  explicit  damping  term  was  added  to  the 
right-handed  side  term,  eliminating  the  instability. 

Conclusion 

The  present  method  offers  substantial  potential 
for  use  in  complex  compressible  viscous  flow  cal¬ 
culations.  Using  the  same  Courant  numbers,  it  is 
faster  and  simpler  than  existing  implicit  methods 
because  it  does  not  require  inversion  of  block  tri¬ 
diagonal  matrices.  At  the  present  time  its  use  at 
higher  Courant  numbers  is  limited  by  the  choice  of 
boundary  conditions  to  supersonic  flows.  Its 
general  usefulness  for  transonic  flows  depends  on 
future  research  in  the  area  of  boundary  conditions. 

The  implicit  part  of  the  boundary  conditions 
deserves  special  attention  as  well  as  the  formulation 
of  non-ref lective  boundary  conditions  for  the  inflow 
and  outflow  boundaries. 

The  5  *  constant  boundaries  (upstream  and 
downstream)  did  not  represent  a  problem  in  this 
case,  because  the  CFL  number  wa3  here  0(1). 
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Figure  1.  Velocity  and  pressure  profiles  for  Couette 
flow  at  different  x  locations. 
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IMPLICIT  MacCORMACK  SCHEME 


Summary 

An  extremely  interesting  implicit  scheme  for  viscous  flow  calculations 
was  suggested  by  MacCormack.  This  scheme  was  generalized  for  arbitrary  two- 
dimensional  geometries  and  examined  for  its  use  in  turbomachinery  type 
geometries.  The  scheme  was  found  to  be  potentially  useful  for  supersonic 
inlet  analysis  but  generally  not  useful  for  turbomachinery  analysis.  A 
copy  of  an  AIAA  paper  presented  at  the  20th  Aerospace  Sciences  meeting  is 
enclosed  as  part  of  this  report. 
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COMPUTATIONAL  PLAN 

As  a  program  management  aid,  a  three-year  computational  plan  has 
been  created  and  is  outlined  in  figures  (1-3) .  Figure  (1)  attempts  to 
outline  logical  interrelationships  between  development  work  on  the  two 
and  three  dimensional  viscous  codes,  specific  test  examples  to  be  cal¬ 
culated,  specific  problem  areas  to  be  examined  and  general  project  ob¬ 
jectives.  It  can  be  seen  that  the  two-dimensional  viscous  code  is  pro¬ 
jected  to  be  used  both  as  a  development  and  test  tcol  for  the  three- 
dimensional  code  and  to  study  important  flow  problems  such  as  base 
pressure  prediction  and  film  cooling  heat  transfer  prediction.  Figures 
2  and  3  are  projected  time  lines  for  the  code  development  and  checkout. 
The  test  examples  chosen  are  expected  to  be  modified  through  interaction 
with  the  sponsors  and  by  events. 


LOGIC  DIAGRAM  FOR  VISCOUS  CODE  DEVELOPMENT 

VISCOUS  3-0  VISCOUS  2-0 

TEST  PROBLEMS  DEVELOPMENT  COMMON  PROBLEMS  DEVELOPMENT  TEST  PROBLEMS 
FOR  3-0  COOES  STEPS  AND  OBJECTIVES  STEPS  FOR  2-0  COOES 


PROJECTED  TIMELINE  l-OR  VISCOUS 


linpiuvc  Ai'/uti  1  liiii  Accuracy 
NpccJ.  tills  I-’ lux  IKi lance 
MIS  Numeric:; 


L’oiivo r L  1'roiiiMit  3-D  Euler 


Figure  3 


101 


TWO  DIMENSIONAL  VISCOUS  CODE  STATUS 
Summary 

The  status  of  the  present  two-dimensional  viscous  code  development 
is  illustrated  through  a  series  of  calculations  for  the  RR  T7  cascade 
at  design  incidence.  The  first  inviscid  calculation  illustrates  that 
the  problems  of  global  mass  and  momentum  conservation  associated  with 
time  marching  codes  have  been  solved  and  that  adequate  blunt  leading  edge 
resolution  can  be  obtained  with  sheared  grid  systems.  Analysis  of  such 
solutions  has  shown  that  the  artificial  viscosity  terms  used  to  control 
algorithm  stability  are  themselves  an  error  source  when  grid  spacings 
change  rapidly.  Improved  damping  term  formulations  are  the  next  major 
focus  for  two-dimensional  code  development.  A  second  set  of  inviscid 
and  laminar  viscous  solutions  shows  the  extreme  sensitivity  of  turbine 
cascade  calculations  to  trailing  edge  geometry.  Since  our  intent  is  not 
to  develop  an  inviscid  flow  code,  inviscid  trailing  edge  model  and  in¬ 
viscid  boundary  condition  improvements  will  be  directed  by  Dr.  Norton 
of  R-R. 

Inviscid  Flow  Calculations 

A  set  of  test  calculations  for  the  RR  T7  cascade  geometry  were  con¬ 
ducted  in  order  to  evaluate  general  predictive  capabilities  of  the 
present  code  and  grid  systems.  The  test  calculations  were  all  conducted 
at  design  incidence.  The  first  grid  system  chosen  is  shown  in  figure  1 
and  uses  69  axial  points  and  30  tangential  points.  A  rather  blunt  trailing 
edge  with  coarse  grid  resolution  was  used.  Figure  2  shows  the  comparison 
between  predicted  and  measured  surface  Mach  numbers.  Reasonable  agreement 
exists  on  the  pressure  surface.  Leading  edge  resolution  appears  adequate 


but  rapid  changes  near  the  leading  edge  appear  to  be  caused  by  the 
artificial  damping  algorithm.  Present  damping  algorithms  evolved  for 
grid  systems  of  rather  uniform  density.  When  used  with  the  rapidly 
varying  grids  of  figure  1,  this  algorithm  is  inadequate.  A  stagnation 
pressure  error  contour  plot  for  this  solution  is  shown  in  figure  3. 

Important  pressure  errors  are  generated  at  the  trailing  edge,  but  the 
solution  is  generally  adequate  in  this  respect.  Figures  4  and  5  are 
plots  of  the  mass  averaged  flow  angle  and  mass  flow  rate  against  axial 
distance.  Conservation  of  these  quantities  is  excellent. 

In  order  to  improve  leading  and  trailing  edge  resolution  a  new 
grid,  shown  in  figure  6,  was  constructed.  This  grid  has  89  axial  points 
and  50  tangential  points.  A  "sharp"  trailing  was  modeled.  The  surface 
Mach  number  comparison  for  this  grid  is  shown  in  figure  7  and  can  be 
seen  to  be  very  poor.  Suction  surface  velocities  and  blade  lift  are 
very  low  as  is  the  suction  surface  trailing  edge  velocity.  This  result 
is  quite  surprising  unless  one  realizes  that  the  blade  lift  (circulation) 
is  controlled  by  the  trailing  edge  model,  and  the  "sharp"  trailing  edge 
does  not  produce  a  good  flow  model. 

Viscous  Flow  Calculation 

In  order  to  test  the  importance  of  trailing  edge  modeling  a  laminar, 
viscous,  design  incidence  solution  was  computed.  The  grid,  shown  in 
figure  8,  has  100  axial  points  and  50  tangential  points.  Since  the  object 
was  to  test  trailing  edge  models,  no  attempt  to  generate  a  good  viscous 
grid  was  made.  Instead,  the  same  leading  and  trailing  edge  axial  spacing 
was  retained,  and  10  points  added  near  each  blade  surface.  Computed  surface 


pressures  are  shown  in  figure  9  and  can  be  seen  to  compare  much  more 
favorably  with  the  measured  pressures  than  do  the  second  inviscid  test 
case.  The  same  type  of  leading  edge  pressure  oscillations  can  also  be 
seen  in  this  solution. 

Implication  for  Viscous  Code  Development 

The  proper  conclusion  to  be  drawn  from  these  comparisons  is  not 
that  viscous  effects  are  important  for  the  T7  at  design  incidence,  but 
rather  that  trailing  edge  modeling  is  important  for  inviscid  calculations 
Since  our  objective  is  not  really  to  develop  inviscid  code  predictions, 
trailing  edge  models  and  boundary  condition  procedures  for  inviscid 
calculation  will  be  directed  by  Dr.  Norton  of  RR.  Since  boundary  layer 
calculations  and  turbulence  modeling  are  also  being  followed  by 
Dr.  Norton,  the  possible  use  of  these  codes  for  inviscid  predictions 
will  be  adequately  explored. 

These  examples  demonstrated  persistent  leading  edge  oscillations 
which  are  felt  to  be  not  really  due  to  blunt  leading  edges  but  an 
artifact  of  the  artificial  damping  algorithm  when  used  with  rapidly 
varying  grid  mesh  spacings.  This  problem  and  the  generation  of  more 
realistic  viscous  grids  are  to  be  the  next  areas  of  .work  for  the  two- 


dimensional  code. 
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FIRST  ANNUAL  RRI/ONR  REPORT 


EXPERIMENTAL  EFFORTS 

Introduction 

This  report  summarizes  the  experimental  activities  during  the 
first  year  of  the  Investigations  of  Flow  Fields  and  Heat  Transfer  in 
Modern  Gas  Turbines.  Activities  during  this  period  consisted  primarily 
of  the  conceptual  and  preliminary  mechanical  design  of  the  Blowdown 
Turbine  Facility. 

The  goals  of  the  program  are  summarized  in  Figure  1.  The 
emphasis  is  on  developing  the  capability  to  simultaneously  measure  the 
heat  transfer  and  aerodynamics  of  a  full  size  film  cooled  high  pressure 
turbine  under  rigorously  simulated  conditions.  Thus,  all  the 
nondimens ional  force  and  energy  ratios  (Reynolds  No.,  Mach  No.,  Rossby 
No.,  Prandtl  No.,  Eckert  No.)  will  be  kept  the  same  as  for  the  full 
scale  turbine.  This  must  be  accomplished,  however,  at  a  cost  level, 
both  for  construction  and  operation,  consistent  with  a  University 
operation . 

Scaling 

The  principal  scaling  is  one  of  temperature.  The  ten^erature 
ratios,  main  flow  gas  to  metal  and  coolant  flow  to  metal,  are  kept  the 
same  as  for  the  full  scale  turbine  but  the  absolute  temperature  levels 
are  reduced.  In  this  case,  the  metal  temperature  is  scaled  to  room 
temperature.  The  resultant  gas  temperatures  are  478 °K  for  the  main  flow 
and  2 12°K  for  the  coolant. 

Another  variable  in  the  scaling  is  that  of  gas  composition.  Air  is 


a  possible  choice  but  has  a  ratio  of  specific  heats,  y  *  1.38  at  478°K 
instead  of  the  y  =  1.28  typical  of  high  temperature  turbines.  By  mixing 
light  and  heavy  gases  we  can  adjust  y,  both  to  simulate  the  full  scale  and 
to  investigate  the  importance  of  y  matching  for  aerodynamics  &  heat 
transfer  testing.  Figure  2  lists  the  properties  of  several 
argon-refrigerant  mixtures  while  Table  3  lists  the  inlet  conditions 
required  for  similarity  with  the  full  scale  turbine.  Note  that  not  only 
does  the  gas  mixtures  match  y,  but  the  Reynolds  No.  is  matched  at  a  much 
lower  pressure,  4-5  atm. ,  implying  considerable  savings  in  tanks,  piping, 
etc.  as  well  as  in  the  amount  of  test  gas  required.  Of  the  refrigerants 
investigated,  R-12  (Freon-12)  is  by  far  the  least  expensive  ($l/lb)  and 
therefore  the  gas  of  choice  for  the  main  flow.  The  coolant  temperature  is 
low  enough  to  condense  R-12  at  the  max  pressure  considered,  10  atm,  so 
that  R-14  would  be  used  for  the  coolant.  Other  advantages  of  the  gas 
mixture  stem  from  the  25%  reduction  in  the  speed  of  sound  compared  to  that 
in  air  at  the  same  temperature.  Thus,  for  a  given  tip  Mach  No.,  the 
rotational  speed  is  only  3/4  as  great-reducing  stresses  by  a  factor  of  2. 
The  blade  passing  frequency  is  similarly  reduced,  decreasing  the  frequency 
response  required  of  the  fixed  frame  instrumentation  for  a  particular 
spatial  resolution  in  the  rotor  frame. 

Figure  4  illustrates  the  Blowdown  Turbine  scaling.  Note  that  the 
tenqperature  ratios  (the  Eckert  No.)  establishes  the  gas  temperatures; 
the  Reynolds  number  establishes  the  inlet  pressure;  the  ratio  of 
specific  heats  establishes  the  gas  composition;  and  the  tip  Mach  No. 
sets  the  rotational  speed.  The  Prandtl  No.  cannot  be  independently  set 
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but  fortuitously  is  essentially  exactly  correct.  The  reduced  absolute 
pressure  levels  and  the  high  molecular  weight  of  the  mixture  combine  to 
reduce  the  turbine  power  produced  by  a  factor  of  approximately  20. 

At  this  point,  the  turbine  size  must  be  selected.  Basically,  as 
large  a  turbine  as  possible  is  desirable  from  the  point  of  view  of 
instrumentation.  In  fact,  even  the  largest  production  high  pressure 
turbines  typically  have  blade  spans  of  only  6  cm.  The  facility  size  scales 
more  closely  with  mass  flow  than  blade  length,  however.  The  mass  flow 
required  for  a  turbine  from  a  50,000  lb  thrust  engine  was  beyond  the 
resources  of  this  program  and  a  0.5  m  (20  inch)  diameter  turbine  size  was 
chosen.  This  was  consistent  with  the  resources  in  hand,  comparable  in  size 
to  the  NASA  High  Pressure  facility,  and  has  a  span  2/3  of  that  of  the 
largest  turbines. 

Steady  state  turbine  performance  measurement  is  not  a  strong 
driver  since  the  goals  of  this  program  are  primarily  concerned  with  time 
resolved  measurements.  Therefore,  the  idea  of  building  the  facility 
around  an  existing  conventionally  tested  turbine  had  considerable 
attractiveness.  It  reduced  the  program  costs  and,  even  more 
importantly ,  provides  a  baseline  performance  against  which  to  compare 
the  blowdown  test  results.  The  turbine  selected  is  of  Rolls  Royce 
Limited  design  and  manufacturer.  It  represents  a  high  work  design  of 
the  mid  to  late  1970's  and  has  a  pressure  ratio  of  approximately  4  to  1. 

The  full  scale  column  in  Figure  4  refers  to  this  turbine. 

For  purposes  of  future  research  capabilities  the  facility  was  sized 
for  twice  full  scale  Reynolds  No.  testing  (10  atmos.  laboratory  inlet  total 


pressure)  and  a  laboratory  inlet  temperature  of  530SK  (SOO^F),  to  permit 
carbon  steel  construction  and  the  use  of  elastomer  seals.  This  results  in 


a  facility  capable  of  simulating  full  scale  conditions  up  to  40  atmos.  at 
2500®K  (4500*R)  as  shown  in  Figure  5. 

Facility  Configuration 

Now  that  the  physical  scaling  is  established,  the  facility 
configuration  must  be  selected.  Many  candidate  configurations  were 
examined  with  four  in  some  detail.  These  are  summarized  in  Figure  6. 

All  of  these  designs  are  based  on  transient  testing.  This  is  both  to 
reduce  the  costs  associated  with  the  experiment  and  to  take 
advantage  of  many  of  the  transient  heat  transfer  and  fluid  dynamic 
testing  techniques  developed  over  the  last  25  years.  Also,  since  the 
predominant  source  of  unsteadiness  in  turbines  is  rotor-stator 
interactions,  a  test  time  on  the  order  of  0.1  to  1.0  sec.  (i.e.  600  to 
6000  blade  passings)  should  be  sufficient  for  most  studies.  This  has 
proved  true  in  several  transient  cascade,  turbine,  and  compressor 
facilities  at  MIT  and  around  the  world. 

One  configuration  studied  used  room  temperature  high 
pressure  gas  storage  (100  to  200  atmos.)  to  feed  a  flow  heater  and  then 
the  turbine.  A  large  flywheel  would  absorb  the  turbine  power  (Figure  5, 
#1).  This  has  the  advantage  of  maintaining  constant  inlet  pressure 
during  the  test  time  with  fast  acting  pressure  regulators.  The 
principal  disadvantage  is  that  the  high  pressure  storage  precludes  the 
use  of  a  heavy  gas  mixture  (the  heavy  gases  condense  at  high  pressure). 


The  second  configuration  is  a  variation  on  the  first  in  which 
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the  high  pressure  storage  is  replaced  by  a  room  pressure  expulsion 
bladder  contained  in  a  heated  shell.  Since,  the  turbine  inlet  pressure 
is  constant  at  room  pressure.  Freon  can  be  used.  This  scheme 
has  the  disadvantage  that  full  Reynolds  number  similarly  cannot  be 
achieved.  Also,  the  technology  required  for  a  100  m^  500* K  balloon  is 
risky. 

The  third  configuration  considered  was  the  original  concept  which 
motivated  this  project.  In  this  scheme  the  turbine  is  directly  coupled 
to  a  compressor  whose  inlet  is  the  turbine's  outlet.  Thus,  the  turbine 
and  compressor  turn  at  the  same  mechanical  speed  and  share  the  same 
flowpath.  Since  a  choked  turbine  produces  power  proportional  to  the  tip 
Mach  No.  squared  and  a  centrifugal  compressor  does  work  as  the  square  of 
the  tip  Mach  No. ,  the  turbine-compressor  pair  are  matched  over  a 
considerable  range  and  operate  at  constant  corrected  speed  without  the 
need  of  a  control  system.  The  corrected  weight  flow  is  constant  so  long 
as  any  orifice  in  the  flow  path  remains  choked.  If  we  now  place  the 
turbine-compressor  test  section  between  two  tanks,  the  upstream  supply 
tank  filled  with  the  gas  mixture  at  appropriate  conditions  and  the 
downstream  dump  tank  evacuated  to  full  vacuum,  the  facility  is  complete 
with  the  addition  of  a  valve  or  diaphragm  between  the  supply  tank  and 
test  section. 

The  design  of  such  a  blowdown  tunnel  was  carried  out  in  some 
detail.  A  compressor  was  chosen  in  this  design  over  a  flywheel  for  two 
reasons.  The  first  is  that  the  compressor  can  maintain  a  closer  match 
in  corrected  speed  compared  to  the  flywheel  since  with  the  flywheel,  the 


turbine  mechanical  speed  must  always  increase  (if  only  a  small  amount), 
while  with  the  compressor,  the  turbine  speed  can  slow  to  match  the  drop 
in  inlet  temperature  that  is  inherent  to  a  blowdown  from  a  fixed  volume 
tank.  Thus,  the  flywheel  may  be  appropriate  for  a  constant  inlet 
condition  scheme  such  as  Figure  5,  Nos.  1  &  2,  but  an  energy  absorber  is 
really  required  for  a  blowdown.  The  compressor  design  proved  to  be  a 
problem,  however.  The  problem  with  the  centrifugal  compressor  is  that 
the  very  high  work  output  of  the  transonic  turbine  requires  a  very  large 
diameter  compressor  wheel  which  in  itself  acts  as  a  large  flywheel.  As 
the  compressor  diameter  passed  1  meter  in  the  design,  it  was  clear  that 
a  safe  mechanical  design  of  the  rotating  system  would  be  very  difficult. 
(Note  that  the  aerodynamic  design  is  not  so  difficult  since  the 
compressor  efficiency  does  not  effect  its  power  absorption,  only  the 
pressure  rise.  The  pressure  rise  does,  however,  determine  the  volume  of 
the  dump  tank  required.  The  principle  aerodynamic  design 
consideration  for  the  compressor  is  that  the  weight  flow  vs.  speed 
characteristic  match  that  of  the  turbine). 

Various  compressor  designs  were  then  attempted  including  a 
supersonic  axial  compressor  and  the  adaptation  of  a  multistage  aircraft 
engine  high  compressor.  These  schemes  did  not  seem  to  offer  much  in 
reducing  either  the  mechanical  complexity  or  the  technical  risk  of  the 
project. 

Another  configuration  investigated  replaced  the  compressor  with 
an  eddy  current  brake.  An  eddy  brake  consists  of  a  moving  conductor 
(metal  disk  or  drum)  in  an  imposed  magnetic  field.  This  induces 
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secondary  currents  (eddy  currents)  in  the  conductor  which  opposes  the 
applied  field,  thus  generating  a  braking  force.  The  power  is 
dissipated  as  heat  in  the  conductor.  The  eddy  brake  is  attractive 
because  the  braking  torque  is  proportional  to  the  square  of  the 
rotational  speed.  Thus,  an  eddy  brake  coupled  to  the  turbine  will,  as 
the  compressor,  match  the  turbine  characteristic  without  the  need  of  a 
control  system.  The  eddy  brake  offered  the  advantage  of  a  high  power 
density  design-simplifying  packaging  and  rig  layout.  Unlike  a 
mechanical  friction  brake,  the  eddy  brake  has  no  moving  parts  other 
than  the  conductor  and  thus  relatively  few  mechanical  design  problems. 

Due  to  the  above  potential  advantages  of  the  eddy  current  brake 
and  also  due  to  the  investigators  expertise  in  compressor  design  and 
ignorance  of  eddy  current  brakes,  the  eddy  brake  configuration  was 
chosen  for  the  Blowdown  Turbine. 

Given  the  selected  configuration,  and  a  0.2  to  0.4  sec.  run  time 
goal,  various  facility  sizes  were  simulated.  Figure  8  illustrates  that 
for  a  typical  eddy  brake  configuration  facility,  the  corrected  speed 
and  turbine  pressure  ratio  remain  constant  to  1%  over  0.4  seconds. 
Subsystem  Detail 
Main  Valve 

A  principle  component  of  this  facility  is  the  main  valve  which 
separates  the  supply  tank  from  the  test  section.  The  valve  must  seal 
vacuum  against  10  atmospheres  pressure  at  500®K  (500°F),  open  fully  in 
50  ms,  and  provide  a  smooth  disturbance  free  inflow  to  the  turbine 
(Figure  9).  The  explosively  ruptured  diaphragm  that  was  used  in  the  MIT 


Blowdown  Compressor  Facility  could  not  be  used  here  because  of  the 


higher  pressure  and  temperature.  The  design  chosen  was  the  pilot 
operated  annular  plug  valve  illustrated  in  Figure  10.  The  valve  is 
constructed  of  steel  and  is  therefore  fairly  massive  (slider  weight  = 

100  Kg) .  It  is  designed  to  operate  in  thermal  equilibrium  and  is 
thus  oil  heated.  Valve  dynamics  depend  upon  a  100  mm  diameter  pilot 
pneumatic  piston  for  the  initial  motion,  with  the  major  part  of  the 
opening  force  coming  from  the  pressure  difference  between  the  outside 
surface  of  the  slider  and  the  inside  damping  chamber  which  is  initially 
evacuated.  As  the  slider  moves,  the  damping  chamber  fills  through  the 
orifice  flow  path  and  then  acts  as  the  damper,  slowing  the  slider  to  a 
stop.  To  aid  in  the  design,  the  valve  dynamics  were  simulated  on  a 
computer .  Typical  predictions  from  the  simulation  are  presented  in 
Figure  11.  The  valve  is  at  its  designated  open  position  (i.e.  valve  flow 
area  is  greater  than  any  downstream  flow  area)  after  less  than  50  ms. 
from  initial  motion  point. 

Eddy  Brake 

The  design  of  the  eddy  brake  has  proven  to  be  a  considerable 
technical  challenge.  An  eddy  brake  torque  vs.  speed  characteristic  is 
shown  in  Figure  12.  The  Blowdown  Turbine  must  operate  on  the  linear 
section  of  the  curve  if  the  power  absorbed  is  to  be  proportional  to  the 
square  of  the  shaft  speed.  As  the  shaft  speed  increases,  the  magnitude 
of  the  eddy  currents  increase  until  at  ug,  the  eddy  field  is  equal 
to  the  applied  field.  The  torque  will  now  decrease  with  increasing 
speed  as  the  eddy  field  excludes  the  applied  field  flux  lines. 


The  challenge  is  to  design  a  brake  with  a  critical  speed,  ug. 


above  the  turbine  operating  range  but  which  is  still  capable  of 
absorbing  the  requisite  power  in  a  reasonable  mechanical  package.  To 
simplify  the  problem,  the  brake  was  designed  to  heat  sink  the  power 
1  produced  (1.2  MW  for  1  sec.)  in  the  moving  conductor.  This  introduced 

an  additional  design  problem  since  at  a  given  mechanical  speed, 
decreases  with  increasing  conductor  thickness.  Thus  the  conductor  must 
^  be  thick  to  absorb  the  power  but  it  must  be  thin  because  of  the 

electrodynamics.  The  solution  adopted  was  to  use  a  very  high 
resistivity,  high  hot  strength  material  in  a  drum  conf iguration  with  10 
I  "horseshoe"  electromagnets  arranged  around  the  periphery.  The  design  is 

summarized  in  Figure  13.  Current  from  a  D.C.  source  is  supplied  to  the 
magnets  as  the  valve  is  open  (the  rotor  is  to  already  have  been  brought 
I  to  operating  speed  by  a  small  D.C.  motor). 

Tanks  and  Auxiliary  Systems 

The  supply  and  dump  tanks  are  of  carbon  steel  construction.  The 
I  supply  tank  volume  is  approximately  14  m3  (365  cu.  ft.)  and  dump  tank 

volume  is  20  m3  (550  cu.  ft.).  The  supply  tank  is  double  walled  so  that 
hot  oil  can  be  circulated  to  heat  the  tank  to  the  desired  operating  point. 
The  valve  is  similarly  heated.  A  silicon  rubber  insert  and  a  water  jacket 
on  the  upstream  flange  keep  the  test  section  at  room  temperature.  A 
commercial  80  KW  electric  oil  system  heats  and  circulates  the  oil.  The 
I  facility  is  illustrated  in  Figure  14. 

Other  auxiliary  systems  include  a  vacuum  pump  for  evacuating  the 
tunnel  and  a  gas  mixture  system  required  to  supply  the  mixture.  The 
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facility  is  designed  with  a  2  to  3  hour  recycle  time. 

Data  Acquisition 

The  data  acquisition  and  analysis  system  for  this  facility  must 
be  capable  of  acquiring  data  with  an  information  bandwidth  of  at  least 
40  kHz  from  20  to  100  sensors  for  at  least  0.5  seconds.  This  implies  a 
prodigious  data  rate  and  volume.  No  commercial  system  was  capable  of 
meeting  these  requirements  at  a  reasonable  price.  The  preliminary 
design  of  a  custom  system  was  performed  and  its  detailed  design  and 
construction  contracted  out.  The  system  is  shown  schematically  in 
Figure  15.  It  consists  of  up  to  100  separate  channels  each  capable  of 
digitizing  at  200,000  san?>les/sec.  There  are  also  up  to  96  lower  speed 
channels  multiplexed  onto  8  high  speed  channels.  The  digital  output 
from  these  channels  is  transmitted  200  ft.  at  an  average  data  rate  of 
256  Mb/sec.  to  a  32  M  byte  semiconductor  memory,  thus  permitting  up  to 
200,000  data  samples  on  each  of  100  channels.  The  data  acquisition 
system  is  under  the  control  of  a  local  microcomputer.  After  the  test  is 
completed,  the  data  is  read  from  the  memory  into  a  Perkii. -Elmer  3242 
computer  for  analysis.  System  specifications  are  summarized  in  Figure 
16.  Initially,  25  high  speed  and  48  low  speed  channels  have  been 
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Instrumentation 


Flowf ield 

One  of  the  principle  advantages  of  the  scaled  blowdown  scheme  is 
that  the  environment  is  quite  benign  when  compared  with  a  full  scale 
turbine.  It  is  in  many  ways  comparable  with  that  of  a  conventional 
compressor  and  thus  most  of  the  standard  compressor  instrumentation 
techniques  can  be  directly  applied  to  the  blowdown  turbine.  The  basic 
high  frequency  technology  is  the  semiconductor  diaphragm  pressure 
transducer  such  as  those  manufactured  by  Kulite,  Entran,  or  Endevco. 
These  are  incorporated  into  various  probe  configurations  with  net 
frequency  response  in  the  10  to  20  kHz  range  (Figure  17). 

Heat  Flux 

Heat  frequency  heat  flux  measurements  on  the  rotor  blades  are 
the  first  goal  of  this  program.  Several  alternate  techniques  have  been 
considered  as  summarized  in  Figure  18.  The  most  promising  technique 
involves  the  measurement  of  the  temperature  difference  across  a  well 
calibrated  insulator  as  shown  in  Figure  19.  Considerable  effort  will 
be  spent  in  developing  these  techniques. 

Conclusions 

A  Blowdown  Turbine  Facility  has  been  designed  to  be  capable  of 
accurate  simulation  of  the  fluid  physics  of  high  pressure  film  cooled 
turbines.  Detailed  design  and  construction  of  this  device  will  proceed 
along  with  development  of  the  required  instrumentation  techniques 


during  the  next  year. 


FIGURE  1 


BLOWDOWN  TURBINE  PRIMARY  GOALS 

FULL  FLUID  PHYSICS  SIMULATION  -  Re,  M,  Pr,  Ro,  A-P 

TIME  RESOLVED  MEASUREMENTS  -  HEAT  TRANSFER 

-  AERODYNAMICS 

-  SIMULTANEOUS  HT  &  AERO 

-  COOLING  AIR  TRANSPORT 

LOW  CONSTRUCTION  COSTS 

LOW  OPERATING  COSTS  -  UNIVERSITY  SCALE 


COMPOSITION  AND  THERMOPHYSICAL  PROPERTIES 
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FIGURE  3 


ARGON- REGRIGE RANT  MIXTURES:  REYNOLDS  SIMILARITY 

TEST  PRESSURES  AND  SATURATION  TEMPERATURES 


MIXTURE  CONDITIONS  AT  REYNOLDS  SIMILARITY 


XR 

P '  /P 

O'  0 

po 

(psia) 

PR 

(psia) 

T  . 
sat 

(°F) 

R-12 

.  2554 

.2229 

64.3 

16.4 

-17 

R-13 

R-13B1 

.  2690 

.2265 

65.3 

17.6 

-65 

R-14 

.2936 

.2637 

76.1 

22.3 

-187 

R-22 

.3384 

.2294 

66.2 

22.4 

-24 

R-115 

.1469 

.2484 

71.7 

10.5 

-50 

FIGURE  4 

HIT  BLOWDOWN  TURBINE  SCALING 


FULL  SCALE 

MIT  BLOWDOWN 

FLUID 

AIR 

AR  -  r12 

RATIO  SPECIFIC  HEATS 

1.27 

1.27 

metal/gas  temp  ratio  tm/tg 

.63 

.63 

MEAN  METAL  TEMP,  T^ 

1113  °i< 

300orK 

INLET  TOTAL  TEMP,  TQ 

1780°K 

478  °K 

COOLING  AIR  TEMP 

790°K 

212  °K 

AIRFOIL  COOLING  AIR  FLOW 

12.5% 

12.5% 

TRUE  NGV  CHORD 

8.0cm 

5.9cm 

REYNOLDS  NUMBER 

2.7xl06 

2.7xl06 

INLET  TOTAL  PRESSURE 

289  psia 

64  psia 

PRANDTL  NUMBER 

.752 

.755 

TEST  TIME 

CONT. 

.2  SEC 

POWER  PRODUCED 


24  MW 


1.3  MW 


COMBUSTION  EXIT  TEMPERATURE 


IMULATION  CAPAB 


IV,  CLOSED  BLOWDOWN  AND  EDDY  CURRENT  BRAKE 


CANDIDATE  FACILITY  CONFIGURATIONS 


FIGURE  7 


POWER  BALANCE 

TURBINE  CONNECTED  TO  EDDY  CURRENT  BRAKE 


TURBINE 


POWER  PRODUCED  ~MJIp2  ^  N2 

EDDY  BRAKE 


POWER  ABSORBED  ~  BQ2  U2^  N2 

CONCLUSIONS 

•  BRAKE  &  TURBINE  CHARACTERISTICS  MATCH 

•  NO  CONTROL  SYSTEM  REQUIRED 

•  MAGNETIC  FIELD  STRENGTH  FINE  TUNES  BRAKE 
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FIGURE  9 


MAIN  FLOW  VALVE 

FACILITY  REQUIREMENTS 

•  NONRESTRICTING  -  OPEN  2  FT,  DIAMETER 

•  FAST  ACTING  -  50mS  FULL  OPEN 

•  HIGH  TEMPERATURE  OPERATION  -  UP  TO  5Q0°F 

•  EXCELLENT  SEALING  -  VACUUM  TO  150  PSIA 

DESIGN  IMPLEMENTATION 

•  ANNULAR  PLUG  DESIGN  CHOSEN 

•  PILOT  OPERATED  -  GAS  FLOW  PROVIDES  OWN  FORCE 

•  CARBON  STEEL  CONSTRUCTION  -  OIL  HEATED 


ALL  SEALS  USE  "9"  RINGS 
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MAIN  VALVE  LAYOUT 
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FIGURE  13 


EDDY  CURRENT  BRAKE  -  DESIGN  SUMMARY 


Operating  Point 

Drum  Material 
Geometry 


Gap  to  Pitch  Ratio 
Resisting  Factor 


co  =  662  rad/sec 

p  =  1.5  x  10^  watts 

INCO  718,  a  -  0.801  x  106  (n-m)-1 
Drum  Radius,  r  =  .16  cm 
Drum  Thickness,  A=  .635  cm 
Total  Air  Gap  Height,  2g  =  1.27  cm 
Number  of  Poles,  N  =  20 
=  0.25  (normal) 


Max .  Torque 
Reynold ' s  Number 


0.7854 


Max .  Torque  Speed 

Power  at  Max. 

Torque 

Static  Field 
Amp.  Turns  per 
Pole 

Excitation  Coils 
Time  Constant 
Total  Power 

"Copper"  cross-section 


cog  =  87  0  rad/sec  (normal) 

PQ  =  2.045  x  10^  watts 
Bq  =  0.781  T 


I  =  7896  amps 


t  (sec)  0.010  0.020  0.025 

c 

P  (watts)  47,000  23,500  18,800 
c 

A  (cm2)  1.60  3.21  4.01 

c 


BLOWDOWN  TURBINE  FACILITY 


FEET 


POWER 

SUPPLY 
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FIGURE  16 

DATA  ACQUISITION-  SYSTEM  COMPARISON 


RESOLUTION/CH 
MAX  SAMPLING  RATE 
APERTURE  UNCERTAINTY 
LINEARITY 
ABSOLUTE  ACCURACY 
SAMPLING  MODE 
GAIN/OFFSET  CONTROL 
CLOCK  GENERATOR 
CLOCK  RESOLUTION 
SAMPLE  SETUP 
MANUAL  READOUT 
TOTAL  SAMPLES/CH 


12  BITS 
200  KHz 
100  ps 
0,02*1%. 

0.1% 

SIMULTANEOUS 
FRONT  PANEL  ' 

4  speed;  20  Hz-200  KHz 

INVERSE 

COMPUTER 

COMPUTER  DISPLAY 

*100,000 


10  BITS 
1000  KHz 
100  ps 
0.2% 

1.0% 

SIMULTANEOUS 

PCB 

3  SPEED;  20  Hz-20MHz 

GEOMETRIC 

SWITCHES 

OSCILLOSCOPE 

1,000 
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FIGURE  17 


PRESSURE-VELOCITY  INSTRUMENTATION 


SENSOR  TECHNOLOGY  -  SILICON  DIAPHRAGM,  STRAIN  GAUGE 

WALL  STATIC  PRESSURE  -  CASING,  NGV'S 

TRAVERSING  PROBE  -  FLOW  TOTAL  l  STATIC  PRESSURE,  2  FLOW 
ANGLES 

-  STATIONARY  FRAME  TRAVERSE 

-  ROTATING  FRAME  TRAVERSE  (BEHIND  ROTOR) 


TECHNIQUES  ADAPTABLE  FROM  COMPRESSOR  RESEARCH 
(SIMILAR  TEMPERATURES  &  PRESSURES) 


FIGURE  18 

HEAT  TRANSFER  INSTRUMENTATION  (NGV's  AND  ROTOR) 
ALTERNATE  TECHNIQUES  AVAILABLE 

A.  BLADE  CALORIMETER  APPROACH 

•  USE  BLADE  AS  CALORIMETER 

•  MEASURE  TEMPERATURE  WITH  THIN  FILM  THERMOCOUPLES 

•  USE  3-D  TRANSIENT  HEAT  TRANSFER  CODE  TO  REDUCE  DATA 
ADVANTAGES 

•  MINIMUM  FLOW  DISTRUPTION 

•  INSTRUMENT  INDUCED  TEMPERATURE  ERROR  LOW 

•  SINGLE  INSTRUMENTATION  MEASURES  HIGH  &  LOW  FREQUENCEIES 
DISADVANTAGES 

•  REQUIRES  3-D  TRANSIENT  HEAT  TRANSFER  CODE 

•  LARGE  CAPACITY  DATA  ACQUISITION  SYSTEM  NECESSARY 

•  CALIBRATION  RELATIVELY  DIFFICULT 

B.  DISCREATE  CALORIMETER  APPROACH 

•  DISCREATE  CALORIMETERS  INSERTED  IN  BLADES 

•  CALORIMETERS  (1  MM  DIAMETER)  THERMALLY  ISOLATED  FROM  BLADE 
ADVANTAGES 

•  SIMPLIFIED  DATA  REDUCTION 

•  EASILY  CALIBRATED 
DISADVANTAGES 

•  DISTURBS  BLADE  BOUNDARY  LAYERS 

•  INTRODUCES  DOWNSTREAM  ERRORS 

•  DIFFICULT  TO  MEASURE  HIGH  &  LOW  FREQUENCIES  WITH  SAME  UNIT 


1*2 

FIGURE  18 

HEAT  TRANSFER  INSTRUMENTATION  (cQNT'd) 

THIN  FILM  HEAT  FLUX  GAUGES  APPROACH 

•  MULTI  LAVER  FLUX  GAUGES  FABRICATED  ON  BLADE  SURFACE 

•  DIFFERENTIAL  TEMP  MEASUREMENT  ACROSS  INSULATOR 
ADVANTAGES 

i  MINIMUM  FLOW  DISRUPTION 

•  INSTRUMENT  INDUCED  ERROR  LOW 

•  Direct  readout  of  heat  flux 

DISADVANTAGES 

•  MUST  BE  VERY  THIN  FOR  HIGH  FREQUENCY  RESPONSE  (1  TO  5uM) 

•  DIFFICULT  TO  FABRICATE 
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FIGl'PE  19 

THIN  FILM  HEAT  FLUX  GAUGE 


DESIGN  -  MEASURE  TEMPERATURE  DIFFERENCE  ACROSS  INSULATOR 

-  SENSORS  ARE  TEMP.  SENSITIVE  RESISTORS 

CONSTRAINTS  -  FREQ.  RESPONSE  LIMITS  INSULATOR  THICKNESS 

-  SELF  HEATING  SETS  SENSOR  RESISTANCE 

TRADEOFFS  -  SENSITIVITY  FOR  FREQ.  RESPONSE 

-  RESISTANCE  FOR  TEMP.  COEF. 


*NOT  TO  SCALE 


